1 Introduction

The development of natural language process- 
ing (henceforth, nlp) systems has reached 
the stage where concentrated efforts are nec- 
essary in the area of representing more 'ab- 
stract', more 'knowledge'-related bodies of in- 
formation. It has been accepted that with- 
out substantial bodies of background infor- 
mation concerning commonsense, everyday 
knowledge about the world or detailed in- 
formation concerning particular domains of 
application, it will not be possible to con- 
struct systems that can support the use of 
natural language. Systems need to repre- 
sent concrete details of the 'worlds' that their 
texts describe: for example, the resolution of 
anaphors, the induction of text coherence by 
recognizing regularities present in the world 
and not in the text, the recognition of plans 
by knowing what kinds of plans make sense for 
speakers and hearers in real situations, etc. all 
require world modelling to various depths. 

This need creates two interrelated problem 
areas. The first problem is how knowledge 
of the world — be it general, commonsense 
knowledge or specialized knowledge concern- 
ing some particular domain — is to be rep- 
resented. The second problem is how such 
organizations of knowledge are to be related 
to linguistic system levels of organization such 
as grammar and lexis. For both problem ar- 
eas the concept of ontologies for nlp has 
been suggested to be of potential value. Very 
generally, an ontology offers a 'conceptual' 
framework for the representation of informa- 
tion — a framework that is sufficiently gen- 
eral, but also sufficiently detailed, to provide 
a rich supportive scaffolding for the construc- 
tion of models of the world. The design of 
such ontologies constitutes an area of concern 
that is coming to be known as ontological engineering
(e.g., [Nirenburg and Raskin, 1987], [Lenat and Guha, 1988],
[Simmons, 1991]). As
we shall see below, most systems that deal 
currently with nlp already adopt some kind 
of ontology for their more abstract levels of 
information. However, theoretical principles 
for the design and development of ontologies 
meeting the goals of generality and detail remain
weak. This is due not only to a lack
of theoretical accounts at these more rarefied,
abstract levels of information, but also to the
co-existence of a range of sometimes poorly
differentiated functions that such bodies of information
are expected to fulfill.

The following list gives an idea of the range 
of functions adopted in nlp. Ontologies are 
often expected to fulfill at least one (and often 
more) of: 

• organizing 'world knowledge', 

• organizing the world itself, 

• organizing 'meaning' or 'semantics' of 
natural language expressions, 

• providing an interface between system 
external components, domain models, 
etc. and nlp linguistic components, 

• ensuring expressibility of input expressions,

• offering an interlingua for machine trans- 
lation, 

• supporting the construction of 'concep- 
tual dictionaries'. 

Moreover, an ontology is seen as a very gen- 
eral organizational device: i.e., one that pro- 
vides a classification system for whatever area 
of application the ontology is applied to. The 
organizational resource offered by an ontology 
has to be re-usable. But it is an open issue as 
to what extent the kinds of organization listed 
here overlap. It cannot be taken for granted 
that they all refer to the same level of ab- 
stract description. It can also not be taken 
for granted that there is unity concerning the 
tasks that are involved in such descriptions. 
This can be seen in the following statement 
from Hobbs. 

'Semantics is the attempted specifica- 
tion of the relation between language 
and the world. However, this requires 
a theory of the world. There is a spec- 
trum of choices one can make in this 
regard. At one end of the spectrum — 
let's say the right end — one can adopt 
the "correct" theory of the world, the 
one given by quantum mechanics and 
the other sciences. If one does this, se- 
mantics becomes impossible because it 
is no less than all of science. . . There's 
too much of a mismatch between the 
way we view the world and the way 
the world really is. At the left end, 
one can assume a theory of the world 
that is isomorphic to the way we talk
about it. ...Most activity in seman- 
tics today is slightly to the right of 
the extreme left end of this spectrum. 
... it fails to move far enough away 
from language to represent significant 
progress towards the right end of the
spectrum.' [Hobbs, 1985]



It probably does not make sense, therefore, 
to talk of a generalized classification system 
without first fixing more precisely the nature 
of its intended function. A further problem is 
that the first of the desired functions above, 
organizing world knowledge, is often taken to 
be definitional for an ontology.² However, the
world — i.e., psychological, logical, or philo- 
sophical views of the world — has not proved 
to be very constraining as to what knowledge 
organizations it requires. 'Ontologies' built on 
the basis of such constraints are, as we shall 
see below, underconstrained and there has ac- 
cordingly been no achievement of the large 
scale resources necessary for re-use across NLP 
systems. 

The main purpose of this paper is to add 
a further round of discussion to that con- 
cerning the design and construction of ontolo- 
gies for NLP. The paper is explicitly explo- 
rative, building on experience in the definition 
and use of such ontologies for text generation. 
The paper is intended to stimulate discussion, 
rather than present solutions — although I 
do conclude with suggestions for certain lines 
of theoretically motivated methodological de- 
velopment for future ontologies. The basic 
path taken in the paper will be to differenti- 
ate among the distinct functions that ontolo- 
gies may serve in order to be better able to set 
out principles and constraints for the design of 
abstract levels of knowledge organization that 
can serve as ontologies appropriate for NLP. 
Seen in more detail, the paper is organized as 
follows. 

2 Or the second may be claimed to be the real task;
however, as Hobbs points out, this actually comes
closer to the first position.

First, I discuss the role of language as a possible
motivating force for designing and populating
ontologies. Second, I introduce several
of the most extensive ontologies that are
currently to be found in NLP systems, characterizing
their precise function and motivation
within their respective systems. Third,
I relate the distinct types of ontology discovered
to possible general linguistic theories that
would support them. It is my contention that 
many principles of organization follow directly 
from the position of suggested bodies of infor- 
mation in the linguistic system as a whole and 
that recognizing this allows efforts in the def- 
inition and construction of such bodies of or- 
ganization to be directed more appropriately 
than has hitherto been the case. For any on- 
tology that is proposed, therefore, it should 
be possible to relate its properties back to a 
motivating linguistic theory. I argue that the 
evidence that we now have from the more ex- 
tensive attempts at ontology construction sug- 
gests strongly that a richly stratified model 
of the linguistic system is required in order 
to achieve the degree of constraint that we 
need for attacking large-scale, re-usable on- 
tology construction. Fourth, I show how the 
ontology of the Penman text generation sys- 
tem — that has been developed largely as 
an instantiation of the highly stratified the- 
ory of systemic-functional linguistics — al- 
ready answers many of the criticisms that 
have been raised against other ontologies. I 
argue that although these criticisms are of- 
ten based on largely post hoc, methodological 
grounds, the vast majority of them also fol- 
low directly from the properties of the linguis- 
tic system and so could (and arguably should) 
have been made prior to attempting ontology 
construction. This can be seen in the proper- 
ties of the Penman ontology, whose very de- 
sign avoids significant criticisms levelled else- 
where. Finally, I suggest how ontology de- 
sign could be improved yet further by taking 
into consideration more input from linguistic 
theory. The Penman ontology, for example, 
is only a partial instantiation of the theoret- 
ical principles underlying it and it is possible 
to show that problems enter into the account 
precisely where the ontology falls short of the 
theoretical specification. 

In general, then, this paper is intended not 
only to improve our understanding of what 
kinds of bodies of information can stand as 
ontologies of various kinds and how such bod- 
ies of information relate to other resources in 
the computational representation of the lin- 
guistic system, but also to make the point that 
appropriate views of the rich dimensions of or- 
ganization exhibited by the linguistic system 
can go a long way to improving our initial 
design specifications for NLP systems. They 
should, therefore, always be considered very
early on in system construction and computa- 
tional theory development. 



2 The role of language in 
ontology justification 

As mentioned above, the move to consider NLP 
systems that require information over and be- 
yond that attributable to surface syntax has 
raised two problems: how to organize that in- 
formation and how to relate that information 
with the less abstract levels of the linguistic 
system. The first problem is typically consid- 
ered in more detail in approaches where the 
operation of a system in some specified do- 
main is the central goal; the second usually 
arises in systems which attempt to model the 
linguistic system itself, focusing less closely on 
the embedding in any particular specified do- 
main of application. 

One common source for knowledge con- 
struction and representation that is found in 
approaches to the first problem is earlier work 
in artificial intelligence (AI). Even early AI
reasoning programs needed to represent the 
state of the world in which the programs were 
to operate. This has given rise to the ar- 
eas of domain modelling and common- 
sense reasoning which are responsible for 
representing concrete details of aspects of the 
world. The enterprise of world modelling 
clearly has many similarities with the require- 
ments of sophisticated NLP systems and there 
has naturally been an influx of techniques and 
attitudes concerning ontology design from the 
AI context.

This has proved most successful in the 
cross-over of techniques of knowledge representation
in AI to techniques for representing
linguistic information. The similarity
between structured inheritance knowledge
representation languages such as KL-ONE
[Brachman and Schmolze, 1985] and its
descendants and current typed feature logics
(e.g., [Smolka and Ait-Kaci, 1989], [Smolka, 1989],
[Nebel et al., 1991]) is an active area of research.
A basic model for the representation
of ontologies can now assume minimally that 
a subsumption lattice over sorts is defined, 
probably with some mechanism corresponding
to the structured inheritance of role information
associated with the sorts, and possibly
additional axioms, or particular inferences,
licensed by specified combinations of
sorts. This will be the representational basis
for ontologies of all kinds that I will assume
throughout this paper.
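
As a rough illustration of this representational basis, the following minimal Python sketch encodes a subsumption hierarchy over sorts with structured inheritance of role information. The class name `Sort`, the example sorts, and the role names are all assumptions made for the illustration; none are taken from KL-ONE or from any system discussed here. Lattice operations (meets and joins) and the additional axioms licensed by combinations of sorts are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Sort:
    """A node in a subsumption hierarchy (all names are illustrative)."""
    name: str
    parents: List["Sort"] = field(default_factory=list)
    # Locally declared roles: role name -> name of the sort filling the role.
    roles: Dict[str, str] = field(default_factory=dict)

    def ancestors(self) -> Set[str]:
        """Names of all sorts subsuming this one, including itself."""
        seen = {self.name}
        for parent in self.parents:
            seen |= parent.ancestors()
        return seen

    def subsumed_by(self, other: "Sort") -> bool:
        """True if `other` is this sort or one of its ancestors."""
        return other.name in self.ancestors()

    def all_roles(self) -> Dict[str, str]:
        """Inherited roles; more specific declarations override inherited ones."""
        merged: Dict[str, str] = {}
        for parent in self.parents:
            merged.update(parent.all_roles())
        merged.update(self.roles)
        return merged


# A hypothetical fragment of an ontology.
thing = Sort("THING")
process = Sort("PROCESS", [thing], {"actor": "THING"})
obj = Sort("OBJECT", [thing])
person = Sort("PERSON", [obj])
teaching = Sort("TEACHING", [process],
                {"actor": "PERSON", "recipient": "PERSON"})
```

In this sketch `teaching.subsumed_by(process)` holds, and `teaching.all_roles()` returns the `actor` role inherited from PROCESS with its filler restricted to PERSON, together with the locally declared `recipient` role.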

In contrast to this consensus, attempts to
decide exactly which sorts make sense for
an ontology based on AI 'knowledge engineering'
principles have been less successful.
Although the effort-intensive nature of domain
modelling naturally calls for consideration
of the re-usability of components of
the knowledge represented across distinct domains,
the ability of AI-centered approaches
to come up with such general organizations
has been limited. Some of the earliest
work in this area was that on 'naive
physics' (e.g., [Hayes, 1979], [Hayes, 1985]):
here the aim was to capture the underlying
'general knowledge' that people have
about physical objects and substances in
the world; similar investigations are reported
in, for example, [Hobbs and Moore, 1985],
[Hobbs et al., 1987], and there are naturally
also connections to be drawn with other work
in semantic and 'conceptual' representation in
AI, e.g., [Schank and Abelson, 1977]. Further
good examples of systems that require detailed
real 'knowledge' in particular domains
are expert systems; here also there is still little
shareability across domain models. The
detailed organization of such systems' knowledge
is typically unique to particular application
domains and shows relatively little cross-domain
re-usability.

We can in part explain this by consider- 
ing the relative importance assigned to the 
distinct functions that such domain models 
in AI are to fulfill. For example, when constructing
a knowledge source whose primary
function is to support the particular inferences
that a given system needs to draw, it is logical
that the organization of that knowledge
be tailored with this goal in mind. This usually
leads, however, to nongeneralizable representational
requirements because the inferences
that distinct systems are to draw have
not been related. The relatively small scale of 
most of this work to date has furthermore lim- 
ited the effectiveness and urgency of investiga- 
tions into re-usability: the cost of construct- 
ing domain models from scratch has not been 
prohibitively high. This cost-equation quickly
changes once more realistically sized bodies 
of information are considered. It quickly be- 
comes much more important that detailed or- 
ganizations of general knowledge applicable to 
many domains are available so as to reduce the 
work involved when moving to new domains. 

The most extensive attempt to cre- 
ate a general scaffolding for represent- 
ing general, background knowledge of the 
world based on AI techniques is the CYC
project [Lenat and Guha, 1988]. The size of
this project (initial projections were for a base 
level of 10,000,000 entries) of necessity forces 
a sharp awareness of the need to have an or- 
ganization for knowledge that is detailed and 
general enough to provide sufficient scaffold- 
ing for supporting large-scale bodies of infor- 
mation in accessible and usable ways. With- 
out clear principles both for the organiza- 
tion of such knowledge and for the selection 
of the information to be represented, the re- 
sult would be disastrous: poorly organized 
knowledge will be inadequate both theoreti- 
cally, in that it fails to capture significant gen- 
eralizations, and practically, in that it fails 
to be usable as a resource. The procedure 
followed in CYC is to divide up types of en- 
tities into categories that appear to behave 
differently, i.e., concepts are classified accord- 
ing to the kinds of inferences that they al- 
low to be drawn about themselves. Problem- 
atic here, therefore, is precisely which kinds 
of inferences are to be taken as definitional. 
This does not appear to have been made ex- 
plicit and so the procedure does not provide a 
particularly sound methodology. The result- 
ing domain-independent, and hence re-usable, 
portion of the CYC ontology is accordingly not 
very deep, somewhat tangled, and supports 
limited inferences. It then becomes increas- 
ingly necessary to raise questions concerning 
the consistency of distinct areas of knowledge 
represented and, consequently, how one can 
use that knowledge. 
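
The dependence of such a classification on the choice of definitional inferences can be made concrete with a small Python sketch. It is a toy illustration of the argument only: the concepts, the inference labels, and the function `classify` are all invented here and do not reproduce anything in CYC.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Set

# Hypothetical toy data: the kinds of inference each concept licenses.
LICENSED: Dict[str, Set[str]] = {
    "Dog": {"has_mass", "has_location", "can_act"},
    "Rock": {"has_mass", "has_location"},
    "Meeting": {"has_duration", "has_location"},
}


def classify(licensed: Dict[str, Set[str]],
             criterial: Set[str]) -> Dict[FrozenSet[str], List[str]]:
    """Group concepts by the criterial inferences that they license."""
    groups: Dict[FrozenSet[str], List[str]] = defaultdict(list)
    for concept in sorted(licensed):
        groups[frozenset(licensed[concept] & criterial)].append(concept)
    return dict(groups)


# Two different choices of what counts as definitional.
by_mass = classify(LICENSED, {"has_mass"})
by_location = classify(LICENSED, {"has_location"})
```

With `has_mass` as the only criterial inference, Dog and Rock fall together and Meeting stands apart; with `has_location`, all three concepts collapse into a single category. Unless the criterial set is stated, the resulting categories are arbitrary to that extent.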

It needs to be recognized that it is essen- 
tial to define the purpose for which a body of 
information is to be used in order to define ap- 
propriate organizations for that information. 
As long as the purposes are unclear, or too 
varied, consistent organizations will be difficult
to achieve. The statement that a general
ontology of real-world knowledge should simply
'represent' that knowledge is underspecified.
It does not provide sufficient guidance
for finding useful organizations for that knowl- 
edge. Given that we need a general organiza- 
tion and that that organization will be deter- 
mined by purpose, we clearly need a very gen- 
eral (but still formally specifiable) task that 
requires particular inferences to be performed. 
If it were possible to find such a task, then 
it would be possible to use it as a guiding
methodology for constructing general organizations
of knowledge. Precisely one such general
task is, of course, the expression of knowledge
in natural language: whatever the knowledge
that is represented, i.e., whatever domain
and however general/specific, it should
be possible to express that knowledge linguistically.³
One additional set of constraints that
one can apply in the construction of organiza- 
tions of knowledge that attempt maximal ap- 
plicability across domains is then that offered 
by language. 

This must be specified further. For ex- 
ample, the acceptance of 'ways of talking' 
about categories as evidence for the existence 
of those categories in an ontology is a very old 
strategy (e.g., Aristotle) and is present even in 
CYC. This method of justification is, however, 
limited to seeing what one can say and still 
make sense about a category rather than any 
more technical analysis of linguistic proper- 
ties. The precise 'inferences' that are being re- 
lied upon to shape the organization are, there- 
fore, still not being given. Thus, there are ex- 
amples of ontologies that are constructed in 
nlp systems, where there is a specified rela- 
tionship between concepts and linguistic ex- 
pression, but the relationship is sufficiently 
non-general so as not to provide strong con- 
straints on ontology design. 

3 This is overstated to the extent that some information/knowledge
is often maintained to be inexpressible
linguistically; even if this is so, it is still the
case that by far the widest and most generally applicable
form of expression that we know is language. In
any case, whether or not there exists knowledge that
is inexpressible linguistically will not affect the final
outcome of the discussion below.

One such case is the ontology of KBMT
projects [Carbonell and Tomita, 1987] such as
TRANSLATOR [Nirenburg and Raskin, 1987].
Work of this kind seeks a level of representation
that is minimally different across distinct
languages. Moreover, the value of organizations
of information that are relevant
across distinct domains is clearly recognized
and re-usable ontology portions are actively
sought. However, although the link to lan- 
guage ensured by the machine translation 
task increases the likelihood that this can be 
achieved on a larger scale, the re-usable por- 
tions of the ontologies proposed until now re- 
main small. This can in part be attributed 
to the fact that the appeal made to language 
as a constraining force on ontology design is 
undervalued. The 'external-to-language' at- 
titude towards ontological constructs assumed 
from AI promises to capture abstract models
of the world (or of conceptions of the world — 
a difference that is not criterial at this point) 
and its organization independent of particu- 
lar languages. This appears a tempting di- 
rection for achieving interlinguality. But then 
we find 'motivations' such as the following for 
the categories that are to be adopted within 
an interlingual ontology: 



'Russian has no word that corresponds 
exactly to the English word afford (as 
in I can't afford X or I can't afford to
Y). In a multilingual processing environment,
there might be a concept corresponding
to a sense of the English
word afford. A Russian sentence Ja 
ne mogu sebe etogo pozvolit' (I can't 
allow myself this), uttered in a con- 
text of acquisition . . . should involve the 
concept that represents afford. This 
means that if the units of the repre- 
sentation language are chosen so that 
they are based on Russian lexis, the 
meaning of afford will be missing. But 
this meaning seems sufficiently ba- 
sic to be included in an ontology.' 
[Nirenburg and Levin, 1991] [bold: my emphasis].

4 This is also made problematic by the very multilinguality
of possible linguistic constraints inherent
in machine translation systems: without appropriate
ways of achieving linguistic generalizations across
languages (cf., e.g., [Bateman et al., 1991] for discussion),
the application of linguistic constraints is very
much more difficult.

It is clear that this kind of argumentation
needs to be sharpened considerably; it is also
clear that this can only be done when it has
been established exactly what function the
'ontology' is to serve. In general, the more
detailed the linguistic constraints adopted on
ontology design are, the more detailed and
explicitly justifiable that ontology design becomes.⁵
However, the relationship between
ontologies and NLP is interestingly reflexive.⁶
Ontologies appear necessary for the organization
of knowledge appropriately for use by NLP
systems, and simultaneously the explicitness
of the necessary inferences that constitute an
NLP system provides an until now unrivalled
source of constraint for deciding on ontology
designs.

This connection is described well in the fol- 
lowing citation from Ewald Lang: 

'...the structure of language plays a 
dual role. It is, properly allocated to 
the parsing and generating components, 
a constitutive part of the object to be 
modeled (that is, the system which is to 
integrate linguistic and non-linguistic
knowledge). But at the same time it 
is also part of the device by means 
of which this object is accessed, that 
is, the categorization of lexical items 
into nouns, verbs, etc., provides an ap- 
parently natural grid for establishing 
corresponding sorts of entities in the 
ontology, which, by definition, is to 
represent non-linguistic common sense 
knowledge. Given this, the risk of con- 
fusing linguistic and non-linguistic cat- 
egories is latent; moreover, it is practi- 
cally unavoidable as long as we are con- 
fined (or confine ourselves) to looking at 
common sense knowledge through the 
window of language only, i.e., without 
a chance to draw on independent ev- 
idence from non-linguistic (say, visual 
or kinaesthetic) ways of accessing the
structure and contents of common sense
knowledge.' [Lang, 1991, p464]



5 This was also one result of an extensive study of proposed
ontologies reported upon in [Skuce and Monarch, 1990].
There has, however, also been at least one example of
development that has attempted movement in the opposite
direction. The abstraction structure of BBN's
natural language and understanding project JANUS
was redesigned away from a linguistically oriented description
in order to find a 'more general ontological
style' [Weischedel, 1989, 200] that was not so strongly
connected with the linguistic realization of the concepts
defined. However, this very move was probably
one contributing factor to the less than successful
outcome of the subsequent attempt to use the Longman
Dictionary of Contemporary English as the basis
for defining a domain-independent taxonomy for
JANUS [Reinhardt and Whipple, 1988]. The most significant
generalizations that could have helped organize
the taxonomy for the purposes of natural language
processing had probably already been lost.

6 Or even circular: as I shall mention below.

Thus, while linguistic patterns are probably
the richest source of organizational criteria 
that are available to ontology design, their use 
is certainly not unproblematic. Consequences 
of this can be seen in the fact that although 
the majority of recent and currently planned 
natural language processing systems recognize 
the necessity of some level of abstract 'seman- 
tic' organization similar to an ontology that 
classifies knowledge explicitly according to its 
possibility for linguistic expression,⁷ very few
have achieved ontologies of any size and mo- 
tivations for inclusion of particular concepts 
and distinctions in ontologies remain limited 
or underspecified. Thus, the decision to use 
linguistic evidence by itself is still, unless fur- 
ther restricted, underspecified and leaves open 
a range of positions. These give rise to differ- 
ing functionalities that the ontologies are to 
serve, which hence impacts on ontology de- 
sign. The positions and functionalities need 
to be characterized more precisely and this I 
attempt in the following section. 



3 Three kinds of ontologies 

Although I have concentrated until now on
preliminaries to the first problem area mentioned
in the introduction (how knowledge
of the world is to be represented), the apparent
value of applying linguistic constraints
to this task renders the second problem area
(how that knowledge is related to language)
crucial. If the ontology cannot be related
to language in an explicit, formalized fash- 
ion, then the structures (and functions) of 
language will be prevented from having a di- 
rect constraining influence on what gets rep- 
resented in the ontology, what not, and how 
the entire ontology is to be organized. 



7 including, for example: the Functional Sentence
Structure of XTRA [Allgayer et al., 1989];
[Dahlgren et al., 1989]; the POLYGLOSS project
[Emele, 198], [Emele et al., 1990]; certain of the domain
and text structure objects of SPOKESMAN [Meteer, 1989];
TRANSLATOR [Nirenburg et al., 1987]; the Semantic
Relations of EUROTRA-D [Steiner et al., 1987]; the
JANUS project [Weischedel, 1989]; and the ontological
types of the ACORD project [Moens et al., 1989].
Moreover, ontology-like organizations of information
have also been found useful for parsing applications
by, e.g., [Calder et al., 1989], [Chen and Cha, 198],
[Hinrichs et al., 1987], [Zajac, 1985]. There are no
doubt many other places where this kind of construct
now appears.



There are at least two theoretically distinct 
standpoints from which this second problem 
area has been addressed in NLP systems. One 
possibility is to assume that real-world do- 
main knowledge is more or less directly linked 
to grammatical and lexical forms of expres- 
sion. The organization of the world knowledge 
ontology should then, ideally, also be support- 
ive of the use of that knowledge for linguistic 
expression or for interpreting linguistic dis- 
tinctions: the problem of relating knowledge 
to language is thus subordinated to the world 
knowledge ontology design. A second possibil- 
ity is to assume that the relationship between 
real-world domain knowledge and grammar 
and lexis is itself complexly structured. This 
structuring may lean for its organization to- 
wards the world knowledge ontology, in which 
case this would blend into the first possibility, 
or towards the grammar and lexicon, or al- 
ternatively could rely on its own principles of 
organization. Each of these variants has been 
adopted in some system where a concrete on- 
tology has been attempted. This gives rise to 
three distinct kinds of ontology that can be 
found in NLP work. An ontology can be 

• an abstract semantico-conceptual representation
of real-world knowledge that
also functions as a semantics for use of
grammar and lexis; this type I will term
a mixed ontology: O_m;

• an abstract organization underlying our
use of grammar and lexis that is separate
from the conceptual, world knowledge ontology,
but which acts as an interface between
grammar and lexis and that ontology;
this type I will term an interface
ontology: O_i;

• an abstract organization of real-world
knowledge (commonsense or otherwise)
that is essentially non-linguistic; this
type I will term a conceptual ontology:
O_c.

The relationships involved here, their embedding
in general architectures, and the subtypes
of interface ontologies mentioned above
are depicted graphically in Figure 1.
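
The three kinds of ontology and the two architectures in which they are embedded can also be written down as a small data structure. The Python sketch below is only one possible encoding: the names `OntologyKind`, `Level`, `ARCHITECTURE_1` and `ARCHITECTURE_2` are assumptions introduced for the illustration, and each architecture is reduced to an ordered list of levels.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class OntologyKind(Enum):
    MIXED = "O_m"
    INTERFACE = "O_i"
    CONCEPTUAL = "O_c"


@dataclass(frozen=True)
class Level:
    """One level of an architecture, with the ontology organizing it, if any."""
    name: str
    ontology: Optional[OntologyKind] = None


# Architecture 1: one level serves as both semantics and world knowledge.
ARCHITECTURE_1: List[Level] = [
    Level("lexicogrammar"),
    Level("semantic/conceptual", OntologyKind.MIXED),
]

# Architecture 2: an interface level sits between grammar and world knowledge.
ARCHITECTURE_2: List[Level] = [
    Level("lexicogrammar"),
    Level("semantics", OntologyKind.INTERFACE),
    Level("conceptual", OntologyKind.CONCEPTUAL),
]


def mediating_levels(architecture: List[Level]) -> List[str]:
    """Levels strictly between lexicogrammar (first) and world knowledge (last)."""
    return [level.name for level in architecture[1:-1]]
```

Here `mediating_levels(ARCHITECTURE_1)` is empty, reflecting the direct link between grammar and knowledge in a mixed ontology, while `mediating_levels(ARCHITECTURE_2)` returns the single interface level.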

Figure 1: Three kinds of ontology in NLP
(Architecture 1: lexicogrammar; semantic/conceptual level, Ontology Type 1.
Architecture 2: lexicogrammar; semantics, Ontology Type 2; conceptual level, Ontology Type 3.)

3.1 Conceptual ontologies

Most of the AI designed ontologies, including
those of CYC, TACITUS [Hobbs et al., 1987],
JANUS [Weischedel, 1989], 'the naive semantics'
of [Dahlgren et al., 1989], and even some
aspects of the KBMT ontology, e.g.,
[Nirenburg and Raskin, 1987],
[Nirenburg and Levin, 1991], are attempts
to construct ontologies of the third type: pure 
maximally language independent ontologies 
reflecting the structure of the world. I have 
already discussed some of the difficulties of 
designing such ontologies without building up 
through an account of language. Psycholog- 
ical research might offer another source of 
evidence for such ontologies; as would de- 
tailed sociological work on the commonsense 
world. It is, however, unclear whether any 
such methodology will be able to avoid the 
relationship to language observed above and 
so I will now concentrate on ontologies which 
are at least intended to be related explicitly 
to language. 

3.2 Mixed ontologies 

An example of a mixed ontology (i.e., one
where there is no extensive treatment of the
relation between the world knowledge ontology
and grammar and lexis maintained separately
from the ontology itself) is the approach
taken in the LILOG natural language understanding
project [Herzog and Rollinger, 1991]; details
of the ontology are given in, for example,
[Klose and von Luck, 1991], [Pirlein, 1991],
[Klose et al., 1991], and details of the relation
between linguistic forms and conceptual
representations are given in, for example,
[Gust, 1991], [Bosch, 1991]. It may at first
glance appear strange to classify LILOG here,
since the approach to the relation between
linguistic form and world knowledge draws
heavily on [Bierwisch, 1982]'s theory of semantics
where, to cite [Gust, 1991]'s statement
of Bierwisch's position: 'semantic
forms and conceptual structures belong to
different and strictly discriminated levels'.
[Bosch, 1991] also makes it very clear that
he holds this distinction to be crucial for 
making progress in semantics and knowledge 
representation. However, when the mod- 
elling of the approach is examined, we find 
that this distinction of levels comes under 
attack. For example, both semantic forms, 
which are derivable from the lexicon and from 
grammatical analysis, and conceptual forms 
are represented in a single language (the semantic
language being a subset of the conceptual
language: [Bosch, 1991, p248]) and
are freely combinable; moreover [Gust, 1991,
p133] maintains that: 'there are continuous
variations between semantic forms and con- 
ceptual structures.' This gives rise to lexi- 
cal entries which directly contain categories of 
an ontology which also contains categories of 
real-world knowledge. The relation between 
conceptual knowledge and grammatical and 
lexical form is thus handled by logically ma- 
nipulating categories from a single ontology 
until categories are found that possess links to 
grammatical or lexical entries — this is pre- 
cisely the architecture consistent with a mixed 
ontology as shown in Figure |l|. 

An illustration of the nonseparation of 'linguistic' information and
'conceptual' information typical of ontologies of this type can be seen
in the following, taken from [Bosch, 1991, p251]. In order to find the
interpretation in context of the lexeme "school" as it is used, arguably
differently, in examples such as:

a. The school made a major donation. 

b. The school has a flat roof. 

A general 'lexical semantic entry' for the lex- 
eme is retrieved thus: 

SEM("school") = λX [PURPOSE X W]

where W = PROCESSES_OF_LEARNING_AND_TEACHING

This is then interpreted by applying a given 'contextualizing function',
selected on the basis of the semantic interpretation of the predicate in
the lexicogrammatical representation. Those for the example sentences
would be:

a. λX [INSTITUTION X & SEM X]

b. λX [BUILDING X & SEM X]

Combining the semantic entry and the contextualizing function gives the
required 'conceptual' concept that is the referent of the lexeme in
context, i.e., that "school" is interpreted as either an institution or
a building. All of the undefined predicates found in these logical
expressions (e.g., INSTITUTION, PURPOSE, etc.) are sorts defined in the
ontology. A direct link is therefore constructed between
lexicogrammatical information and chunks of information appropriate for
the conceptual level of organization. As we shall see in Section 4.2,
this direct linking is a common property of NLP systems based on the
common notion of 'semantics' and arises out of a view of the linguistic
system that collapses together several important distinctions.
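
The combination of lexical entry and contextualizing function described for "school" can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an entity is modelled as a set of sort labels, and the names `sem_school`, `contextualize` and the label `PURPOSE_W` are invented here; they are not taken from LILOG or from [Bosch, 1991].

```python
from typing import Callable, FrozenSet

# Assumption: an entity is represented simply as the set of ontology
# sort labels that hold of it. This is a toy encoding, not LILOG's.
Entity = FrozenSet[str]
Predicate = Callable[[Entity], bool]

# Stands in for [PURPOSE X W] with W = PROCESSES_OF_LEARNING_AND_TEACHING.
PURPOSE_W = "PURPOSE:PROCESSES_OF_LEARNING_AND_TEACHING"


def sem_school(x: Entity) -> bool:
    """Toy version of SEM("school") = lambda X [PURPOSE X W]."""
    return PURPOSE_W in x


def contextualize(sort: str, sem: Predicate) -> Predicate:
    """Build lambda X [SORT X & SEM X] for a sort drawn from the ontology."""
    def interpreted(x: Entity) -> bool:
        return sort in x and sem(x)
    return interpreted


# The two contextualizing functions of examples (a) and (b).
school_as_institution = contextualize("INSTITUTION", sem_school)
school_as_building = contextualize("BUILDING", sem_school)

donor = frozenset({"INSTITUTION", PURPOSE_W})
flat_roofed = frozenset({"BUILDING", PURPOSE_W})

print(school_as_institution(donor), school_as_institution(flat_roofed))
print(school_as_building(donor), school_as_building(flat_roofed))
```

Note that the lexical predicate and the ontology sorts live in one and the same vocabulary of labels, which is the point being made about mixed ontologies.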

3.3 Interface ontologies 

The second and third types of ontology, the interface and conceptual
types, usually occur, at least theoretically, in the same architecture,
although some systems address themselves to the organization of the
interface ontology without specifying how the conceptual ontology will
look. This latter position is common for systems that are intended as
general purpose NLP systems reusable across different domains and
applications. Examples of such systems include both parsers and
generators, such as the Penman system [Mann and Matthiessen, 1985,
Penman Project, 1989] and Mumble-86 [Meteer et al., 1987]. Here the
problem of how to organize the interface with external applications,
where those applications are not known in advance, has naturally focused
attention on organizations of information appropriate for interfacing.
The approach to this developed within the Penman project in terms of the
Upper Model has become more or less typical of how this is achieved,
although to what extent this architecture has arisen independently
across systems is unclear. The initial formulation of the Upper Model
was based on work by M.A.K. Halliday [Halliday, 1982], William Mann and
Christian Matthiessen.^8

8 The development of the Upper Model ontology, from its inception as the
Upper structure of the JANUS project of ISI and BBN, up to its inclusion
as a standard component of the current Penman system, is described in
the following research reports: [Mann, 1985, Mann et al., 1985, Moore
and Arens, 1985, Bateman et al., 1990]. The first detailed theoretical
precursor to the ontology was set out in 1985 by Halliday and
Matthiessen as a general organization for an experiential semantics:
this was called the Bloomington Lattice. The subsequent development of
the Upper Model has deviated somewhat from the purely linguistically
motivated work; this will be discussed in more detail below.

The general statement of the interface problem for NLP systems is that
machine-internal information needs to be related to strategies for
expressing that information in some natural language. This could be done
in a domain-specific way by coding how the application domain requires
its information to appear. This is clearly problematic, however: it
requires detailed knowledge on the part of the system builder both of
how the generator controls its output forms and of the kinds of
information that the application domain contains. A more general
solution to the problem of defining a mapping between knowledge and its
linguistic expression is to provide a classification of any particular
instances of facts, states of affairs, situations, etc. that occur in
terms of a set of general objects and relations of specified types that
behave systematically with respect to their possible linguistic
realizations. This classification has itself many of the properties of
an ontology, e.g., it is a hierarchical organization of sorts and roles,
although by virtue of its motivation in linguistic realization, it must
be seen as a strictly linguistically motivated ontology. Examples here
include aspects of [Meteer, 1989]'s Text Structure Objects in the
SPOKESMAN text generator:

'[i]t is important to remember that Text Structure objects reflect the
semantic type of the expression of the information in an object, not
some intrinsic type of the object itself.' [Meteer, 1989, p21];

also the ontology of the ACORD system: 

'. . . the aim of the sort system is not to reflect the characteristics
of real world objects and events referred to by linguistic expressions,
but rather to systematize the ontological structure evidenced by
linguistic expressions' [Moens et al., 1989, p178];

and, of course, the Upper Model of the Pen- 
man system that I will describe in more detail 
below. 

The position that such an interface ontology 
holds between surface details of a language 
and more abstract knowledge is, however, an 
uneasy one. As suggested above, it is possi- 
ble to differentiate among such ontologies ac- 
cording to whether they orientate themselves 
more towards less abstract or towards more 
abstract levels of representation. This brings 
with it two potential problems in ontology de- 
sign: 

• the ontology can be too shallow, in that its categories are too
direct a recoding of linguistic distinctions and do not achieve a
qualitative increase in abstraction;

• the ontology can be too deep, in that it is no longer possible to
draw any formally specifiable connection between the constructs posited
and the linguistic evidence taken as motivating them.

Both extreme situations occur and both re- 
duce the value of the ontology as an effec- 
tive interface medium. The former problem 
will be accompanied by an increased difficulty 
in linking the ontology to information of par- 
ticular domains — regardless of whether this 
information is considered as a separate kind 
of information or as more specific details of 
the same kind of information; and the sec- 
ond problem will be accompanied both by an 
increased difficulty in linking with grammar 
and lexis and by the problems induced by 
poorer linguistic constraints mentioned above. 
The latter situation then often places a heavier reliance on 'internal'
or formal constraints on organization (cf., e.g., [Weischedel, 1989,
Horacek, 1989] and what [Lang, 1991, p468] terms 'sortal' restrictions)
which, while important, do not provide sufficient grounds for deducing
very much detailed actual content by themselves.

3.3.1 Interface ontologies that are not 
abstract enough 

Interface ontologies exhibiting the former problem are very common and
so it is worthwhile giving a slightly more detailed example of the
problems that arise. One such ontology is that constituted by the
semantic relations used within the German component of the EUROTRA
project [Steiner, 1987, Steiner et al., 1987, Steiner and Reuther,
1989]. These relations are a further development of earlier work by
Fawcett, particularly his work on transitivity in English (e.g.,
[Fawcett, 1987]). Fawcett proposes a semantically motivated taxonomy of
process types, analogously to the approach taken in [Halliday, 1985] but
differing in the actual categories adopted. Each process type has some
distinctive set of possible participants; the approach thus differs from
early accounts of semantic participants, such as Case Grammar [Fillmore,
1968], where the participant relationships were often defined separately
from the processes in which they participate, and further articulates
conceptions of 'thematic' relations such as those found in
Lexical-functional grammar (cf., [Hale and Keyser, 1986, Levin, 1987])
and Government and Binding theory [Jackendoff, 1987]. The EUROTRA-D work
has made refinements to the proposed taxonomy on the basis of
multi-lingual evidence, particularly from German, so as to provide
explicit syntactic tests for the assignment of processes to each of the
various process types. It is then explicitly stated that the resulting
process types described are no longer primarily semantic since their
classification is based exclusively on differentiation by syntactic
criteria. Therefore, although this has produced a framework within which
processes can be classified according to the given taxonomy with a high
degree of inter-coder consistency, which is an important criterion in
large distributed projects such as EUROTRA, its effectiveness as a step
towards a higher level of abstract information has been restricted. This
can be seen in the following example of process classification given in
[Steiner and Reuther, 1989]. For the clause

That she gave no answer means that she 
agrees with the proposal. 

both subject and object are realized by that-clauses and the only
possible classification according to the syntactic tests is then one of
a mental process with two phenomena. However, semantically the process
also has strong elements of a relation between the propositions
involved. Similar examples in German are the following: the verb
retten . . . vor in

Daß er gut schwimmen konnte, rettete ihn vor dem Ertrinken.
That he could swim well saved him from drowning.

again resembles a relational process but has to be assigned to mental
according to the criteria formulated; the process 'reden' (to speak,
talk), which would intuitively seem to be some kind of communication
verb, cannot enter into constructions of the form:

* Peter redet: Karl kommt morgen
  Peter speaks: Karl is coming tomorrow

and so does not receive a communication verb classification; and the
form:

* Peter redet, daß Karl kommt
  Peter speaks that Karl is coming

cannot occur so it may not even receive a mental reading. The only
acceptable forms possible, e.g.:

Peter redet Unsinn
Peter speaks nonsense
Peter redet mit Paul
Peter speaks with Paul

require an action classification, just as the cor- 
responding English processes would. These 
problems provide evidence that the syntactic 
tests need to be made more subtle or more 
elaborate in order to be able to reveal se- 
mantic distinctions more reliably. In addition, 
there is no account suggested of how this level 
of representation can link to more abstract 
levels of representation such as a conceptual 
ontology. 
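
The way purely syntactic tests drive the classification of 'reden' can be made concrete with a deliberately simplified sketch. The frame labels, the ordering of the tests and the function `classify_process` are assumptions made for illustration; the actual EUROTRA-D tests are more elaborate and are not reproduced here. The verb 'sagen' is added only as a contrast case and does not appear in the examples above.

```python
from typing import FrozenSet

# Hypothetical labels for the constructions a verb can enter into.
DIRECT_SPEECH = "direct-speech"    # e.g. "Peter sagt: Karl kommt morgen"
THAT_CLAUSE = "that-clause"        # e.g. "..., daß Karl kommt"
NOMINAL_OBJECT = "nominal-object"  # e.g. "Peter redet Unsinn"
WITH_PHRASE = "with-phrase"        # e.g. "Peter redet mit Paul"


def classify_process(frames: FrozenSet[str]) -> str:
    """Assign a process type using syntactic frames only (toy version)."""
    if DIRECT_SPEECH in frames:
        return "communication"
    if THAT_CLAUSE in frames:
        return "mental"
    # No semantic evidence is consulted: anything left over is an action.
    return "action"


# 'reden' accepts neither direct speech nor a daß-clause.
reden = frozenset({NOMINAL_OBJECT, WITH_PHRASE})
# Contrast case (assumption, not from the text): 'sagen' accepts both.
sagen = frozenset({DIRECT_SPEECH, THAT_CLAUSE, NOMINAL_OBJECT})

print(classify_process(reden))
print(classify_process(sagen))
```

Because the function never looks at meaning, an intuitively communicative verb falls through to the action class, which is the loss of abstraction discussed above.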

A similar case probably contributes to some of the difficulties that
arise with the use of 'Lexical Semantic Structures' (LSS) and
[Jackendoff, 1983]'s 'Lexical Conceptual Structures' (LCS) for
translation as described by [Dorr, 1991] and [Nirenburg and Levin,
1991]. Both structures are tightly bound to possible surface forms by
formally specified linking rules (e.g., [Levin, 1987]). These rules
partition the LSS or LCS into classes reflecting the different
realizational behaviour of their categories. Although it is also then
sometimes possible to assign to these classes particular 'semantic'
features, this has still not yet been found to be sufficiently abstract
to support a motivated construction of the corresponding conceptual
ontology, as the example of the motivation for including the concept
afford for Russian that I cited above shows. The final selection of
conceptual ontological sorts in this case then shows similarities both
with that described for LILOG, i.e., by applying a mixture of lexical,
grammatical, and domain knowledge criteria, and with the pure AI
techniques of CYC and others. In the longer term, therefore, similar
problems will occur.

As a final example of the problems of lack of abstraction, I will
mention some that have arisen in our development and use of the Penman
Upper Model. The Upper Model, for reasons that I will describe below,
does succeed in being more abstract than the semantic relations adopted,
for example, within EUROTRA-D. The organization of an Upper Model
achieves greater semantic coherence, grouping together distinctions that
may be used by a variety of distinct grammatical resources in a grammar.
For example, the relationships between process and participants may
drive the organization of clauses, but they may equally drive the
organization of head and modifiers in nominal groups. The nature of the
process-participant relationships is not, arguably, altered by their
realizational form. Upper Model generalizations might then express the
commonality that unites the following area of variation:

A shoots B 
B was shot at T 
the shooting of B by A 
A's shooting of B 

B's shooting 
the shooting at P 
the P shooting 
the T shooting 
etc. 

under a single specification:^9

process: shoot(murderer: A, murdered: B, time: T, place: P).
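
As a rough illustration of one specification standing behind several surface forms, the following Python sketch derives a handful of the variants listed above from a single record. It is a toy built on string templates: the class `ProcessSpec`, its `nominal` field and the function `realizations` are invented for this example and say nothing about how the Nigel grammar actually chooses among realizations.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProcessSpec:
    """One process with its participants and circumstances (toy encoding)."""
    process: str   # base verb form, e.g. "shoot"
    nominal: str   # nominalized form, e.g. "shooting" (assumed to be given)
    murderer: str
    murdered: str
    time: str
    place: str


def realizations(spec: ProcessSpec) -> List[str]:
    """Clausal and nominal-group variants of the same specification."""
    return [
        f"{spec.murderer} {spec.process}s {spec.murdered}",
        f"the {spec.nominal} of {spec.murdered} by {spec.murderer}",
        f"{spec.murderer}'s {spec.nominal} of {spec.murdered}",
        f"{spec.murdered}'s {spec.nominal}",
        f"the {spec.nominal} at {spec.place}",
        f"the {spec.place} {spec.nominal}",
        f"the {spec.time} {spec.nominal}",
    ]


spec = ProcessSpec("shoot", "shooting", "A", "B", "T", "P")
for form in realizations(spec):
    print(form)
```

The participant roles are stated once in `spec`; whether they surface in a clause or in a nominal group is decided only at realization time.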

Categories in the Upper Model are then cap- 
turing generalizations which are not appropri- 
ately expressed within the grammar. A fur- 
ther example drawn from the 1989 version of 
the Upper Model is the possible grammati- 
cal realizations of the concept of generalized 
possession. This concept should be seen as 
being realized by possible selections from all 
the grammatical systems to do with 'posses- 
sion'. Thus the semantics of the following 
forms all make reference to this single Upper 
Model concept. 

the door's handle 
the handle of the door 
the handle that the door has 

the door handle 
the handle is part of the door 
etc. 

For a more extensive set of examples of the lexico-grammatical variation
that the Upper Model is intended to support, see [Bateman, 1989,
Bateman, 1990a].

9 Although the actual representation used in the Upper Model reifies
both predicates and the relations holding between predicates and their
arguments: cf. [Mann et al., 1985, Bateman et al., 1990, Hobbs et al.,
1987].

This means that the Upper Model does succeed in achieving a sufficiently
high degree of abstraction to be useful as an interface medium. This
increase in abstraction also makes the ontology better suited to linking
with more abstract levels of information.^10 The Penman system has been
successfully interfaced with a number of applications (mostly expert
systems, but also text planners) where domain knowledge is represented.
It is then an example of an ontology that mediates the relationship
between lexico-grammar and world knowledge without losing the necessary
formal connection with the grammar and lexis. Moreover, it moves beyond
problems such as that recognized for the EUROTRA-D classification of
process types:^11

'The classification system proposed by EUROTRA-D proceeds in a strictly
syntactic way. . . . From the standpoint of generation this solution is
problematic: it would be preferable to have a semantic classification
that generalizes across such surface syntactic subtleties.' [Heid et
al., 1988, p158]

10 So much so that it has sometimes been our experience that the domain
model of some application domains has been altered in the light of the
consistent organization that the Upper Model brings to bear.

11 This problem arose while attempting to interface the level of input
specification for an existing generator of German (SEMSYN, cf. [Rösner,
1988]) with the EUROTRA-D semantic relations.

To the extent that it is successful, this is precisely what the Upper
Model provides. It achieves this by being based very closely not only on
a particular, specified grammar (no concepts are admitted into the Upper
Model, for example, unless they have a direct and specifiable
consequence for the operation of the grammar) but also on a grammar
which is itself already more abstract than a constituency grammar. I
shall describe this in more detail below.

The Upper Model thus stands as a significant step forward in dealing
with the problem of interfacing with a general NLP system. The Upper
Model decomposes the mapping problem inherent in relating domain
knowledge with its possibilities for linguistic expression by
establishing a level of linguistically motivated knowledge organization
specifically constructed as a response to the task of constraining
linguistic realizations. While it may not be reasonable to insist that
application domains organize their knowledge in terms that respect
linguistic realizations, as this may not provide suitable organizations
for, e.g., domain-internal reasoning, we have found that it is
reasonable, indeed essential, that domain knowledge be so organized if
it is also to support expression in natural language relying on general
natural language processing capabilities.

The general types constructed within the Upper Model necessarily respect
generalizations concerning how distinct semantic types can be realized.
We then achieve the necessary link between particular domain knowledge
and the Upper Model by having an application classify its knowledge
organization in terms of the general semantic categories that the Upper
Model provides. This should not require any expertise in grammar or in
the mapping between Upper Model and grammar. An application needs only
to concern itself with the 'meaning' of its own knowledge, and not with
fine details of linguistic form. This classification functions solely as
an interface between domain knowledge and Upper Model; it does not
interfere with domain-internal organization. The text generation system
is then responsible for realizing the semantic types of the level of
meaning with appropriate grammatical forms.^12 Further, when this
classification has been established for a given application, application
concepts can be used freely in input specifications since their
possibilities for linguistic realization are then known. Interfacing
with such a system is thus radically simplified on two counts:

• much of the information specific to language processing is factored
out of the input specifications required and into the relationship
between Upper Model and linguistic resources;

• the need for domain-specific linguistic processing rules is greatly
reduced since the Upper Model provides a domain-independent, general and
reusable conceptual organization that may be used to classify all
domain-specific knowledge when linguistic processing is to be performed.
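
The division of labour just described, in which an application only classifies its concepts and the generator alone knows about realization, might be sketched as follows. Everything here is hypothetical: the table `REALIZATIONS`, its realization labels, and the class `DomainModel` are invented for illustration and are not part of the Penman system; only the Upper Model type names object, directed-action and size are taken from this paper.

```python
from typing import Dict, List

# Generator side (assumed labels): realization possibilities per
# Upper Model type. The application never needs to read this table.
REALIZATIONS: Dict[str, List[str]] = {
    "object": ["nominal-group/common-noun"],
    "directed-action": ["transitive-clause", "nominalization"],
    "size": ["adjectival-modifier"],
}


class DomainModel:
    """Application side: records only where each concept is classified."""

    def __init__(self) -> None:
        self._upper_type: Dict[str, str] = {}

    def classify(self, concept: str, upper_model_type: str) -> None:
        """Subordinate a domain concept to an Upper Model type."""
        if upper_model_type not in REALIZATIONS:
            raise KeyError(f"unknown Upper Model type: {upper_model_type}")
        self._upper_type[concept] = upper_model_type

    def realization_options(self, concept: str) -> List[str]:
        """What the generator may do with a concept, via its classification."""
        return REALIZATIONS[self._upper_type[concept]]


domain = DomainModel()
domain.classify("chase", "directed-action")
domain.classify("mouse", "object")
domain.classify("little", "size")

print(domain.realization_options("chase"))
print(domain.realization_options("mouse"))
```

The application calls only `classify`; no syntactic label appears in its own code, which corresponds to the first of the two simplifications listed above.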

12 This is handled in the PENMAN system by the grammar's inquiry
semantics, which has been described and illustrated extensively
elsewhere (cf. [Penman Project, 1989]); see also Section 4.2 below.

An example of the simplification that use of the Upper Model offers for
a text generation system interface language can be seen by contrasting
the input specification required for generators that work with
realization classes that are less abstract than those of the Upper Model
(such as, e.g., MUMBLE-86 [Meteer et al., 1987], or unification-based
frameworks such as [McKeown and Paris, 1987] and the Lexical Functional
Grammar (LFG) approach of [Momma and Dörre, 1987]) with the input
required for PENMAN.^13 Figure 2 shows corresponding inputs for the
generation of the simple clause: Fluffy is chasing little mice. The
appropriate classification of domain knowledge concepts such as chase,
dog, mouse, and little in terms of the general semantic types of the
Upper Model (in this case, directed-action, object, object, and size
respectively; cf. [Bateman et al., 1990]) automatically provides
information about syntactic realization that needs to be explicitly
stated in the MUMBLE-86 input (e.g., S-V-O_two-explicit-args,
np-common-noun, restrictive-modifier, adjective). Thus, for example, the
classification of a concept mouse as an object in the Upper Model is
sufficient for the grammar to consider a realization such as, in
MUMBLE-86 terms, a general-np with a particular np-common-noun and
accessories of gender neuter. Similarly, the classification of chase as
a directed-action opens up linguistic realization possibilities
including clauses with a certain class of transitive verbs and
characteristic possibilities for participants, corresponding
nominalizations, etc. Such low-level syntactic information is redundant
for the PENMAN input.^14 Similar, illustrative input forms can easily be
imagined for other types of syntactically oriented grammar and lexis
components.

13 Note that this is not intended to single out these approaches at all;
the problem is quite general and occurs whenever there is no ontology
available for organizing information at a more abstract level than that
imposed by the grammar. Further, as already noted, most current NLP
developments are moving in a direction analogous to that taken in our
work on the Upper Model.

14 Moreover, when additional information is required, that information
is supplied in semantic terms rather than in terms of morphosyntactic
labeling such as :number plural; in this case this is represented in
inquiry semantics by the inquiry response pairs {:multiplicity-q
multiple} and {:singularity-q nonsingular}. This is also the case for
'tense' but I have abbreviated the semantic specification here. For
descriptions of all these distinctions in detail, see the PENMAN
documentation [Penman Project, 1989].

The further domain-independence of the Upper Model is shown in the
following example of text generation control. Consider two rather
different domains: a navy database of ships and an expert system for
digital circuit diagnosis.^15 The navy database contains information
concerning ships, submarines, ports, geographical regions, etc. and the
kinds of activities that ships, submarines, etc. can take part in. The
digital circuit diagnosis expert system contains information about
subcomponents of digital circuits, the kinds of connections between
those subcomponents, their possible functions, etc. A typical sentence
from each domain might be:

circuit domain: The faulty system is connected to the input.
navy domain: The ship which was inoperative is sailing to Sasebo.

The input specifications for both of these sentences are shown in
Figure 3. These specifications freely intermix Upper Model roles and
concepts (e.g., domain, range, property-ascription) and the respective
domain roles and concepts (e.g., system, faulty, input, destination,
sail, ship, inoperative). Both forms are rendered interpretable by the
subordination of the domain concepts to the single generalized hierarchy
of the Upper Model. This is illustrated graphically in Figure 4. Here we
see the single hierarchy of the Upper Model being used to subordinate
concepts from the two domains. The domain concept system, for example,
is subordinated to the Upper Model concept object, domain concept
inoperative to Upper Model concept quality, etc. By virtue of these
subordinations, the grammar and semantics of the generator can interpret
the input specifications in order to produce appropriate linguistic
realizations: the Upper Model concept object licenses a particular set
of realizations, as do the concepts quality, material-process, etc.^16

15 These are, in fact, two domains with which we have had experience
generating texts using the Upper Model.

16 For further discussion of this simplification in the semantic input
specification for the sentence generator, see [Bateman, 1990b].

Despite the progress that has been made with the Upper Model as a
potential interface ontology, it is still the case that the mappings
between grammatical forms and the categories of the Upper Model ontology
are not yet rich enough to ensure entirely appropriate semantic
classifications, entirely analogously to the case with the explicitly
syntactically oriented categories of the EUROTRA-D semantic relations.
In an attempt to make the definitions of the Upper Model concepts more
accessible to users of the Penman system, these definitions have been
pushed towards an interpretation of the Upper Model as predominantly a
hierarchy of generalizations about possible linguistic realizations in
English. This approach permits a very straightforward control of the
grammar but compromises some of the semantic integrity. Some simple
examples of this may be seen in the following.

In the then current version of the grammar, the following clause, which
is an example that arose during development of Johanna Moore's Program
Enhancer Advisor (PEA) system [Moore, 1989]:^17

X is defined as Y

had to be constructed from a process define and an adjunct of
'role-playing' to produce the prepositional phrase as Y. This contrasts
with a more semantically oriented discrimination of process types which
could take, perhaps, a process of 'defining' with three necessary
participants, a definer, a defined, and a definition, and state how
these are realized directly. In the realization class view as we have it
now, the process of defining has to be explicitly decomposed
semantically at the level of the Upper Model into a process and a
relationship of role-playing. This is not intuitively obvious: indeed, a
user has to know how the grammar generates as-prepositional phrases in
order to arrive at the 'correct' Upper Model classification in order to
be able to generate the clause. This is dangerously close to the amount
of low-level syntactic detail that needs to be provided for a Functional
Unification Grammar or Mumble-86.

17 All the PEA examples were provided in work by Johanna Moore and
Richard Whitney.
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(general-clause 

  :head (CHASES/S-V-O_two-explicit-args
          (general-np
            :head (np-proper-name "Fluffy")
            :accessories (:number singular
                          :gender masculine
                          :person third
                          :determiner-policy no-determiner))
          (general-np
            :head (np-common-noun "mouse")
            :accessories (:number plural
                          :gender neuter
                          :person third
                          :determiner-policy initially-indefinite)
            :further-specifications
              ((:attachment-function restrictive-modifier
                :specification (predication-to-be *self*
                                 (adjective "little"))))))
  :accessories (:tense-modal present :progressive
                :unmarked))

Input to MUMBLE-86 for the clause: Fluffy is chasing little mice
(from [Meteer et al., 1987])

(e / chase
   :actor (d / dog :name Fluffy)
   :actee (m / mouse
             :size-ascription (s / little)
             :multiplicity-q multiple :singularity-q nonsingular)
   :tense present-progressive)

Corresponding input to PENMAN
Figure 2: Comparison of input requirements for MUMBLE-86 and PENMAN
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(vl / connects 

    :domain (v2 / system
                :relations (v3 / property-ascription
                               :domain v2
                               :range (v4 / faulty)))
    :range (v5 / input)
    :tense present)

Input for digital circuit example sentence:
The faulty system is connected to the input

(v1 / sail
    :actor (v2 / ship
               :relations (v3 / property-ascription
                              :domain v2
                              :range (v4 / inoperative)
                              :tense past))
    :destination (sasebo / port)
    :tense present-progressive)

Input for navy example sentence:
The ship which was inoperative is sailing to Sasebo
Figure 3: Input specifications from navy and digital circuit domains
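
How a generator might make sense of such a mixed specification can be sketched by walking the navy input and resolving every concept to an Upper Model type. This is a hedged illustration only: the tuple encoding of the specification, the `SUBSUMED_BY` table and both helper functions are invented here. Of the subordinations listed, only system under object and inoperative under quality are stated in the text; the remaining entries are guesses made for the sake of the example.

```python
from typing import Any, Dict, Iterator, Tuple

# A specification node is (variable, concept, roles); role fillers are
# either nested nodes or plain strings such as "v2" or "past".
Spec = Tuple[str, str, Dict[str, Any]]

# Concepts that belong to the Upper Model itself (small assumed fragment).
UPPER_MODEL = {"object", "quality", "material-process", "property-ascription"}

# Domain concepts subordinated to Upper Model concepts. Only the entries
# for "system" and "inoperative" come from the text; the rest are guesses.
SUBSUMED_BY: Dict[str, str] = {
    "system": "object",
    "inoperative": "quality",
    "ship": "object",
    "port": "object",
    "faulty": "quality",
    "sail": "material-process",
}


def concepts(spec: Spec) -> Iterator[str]:
    """Yield every concept in a specification, depth first."""
    _variable, concept, roles = spec
    yield concept
    for filler in roles.values():
        if isinstance(filler, tuple):
            yield from concepts(filler)


def upper_model_type(concept: str) -> str:
    """Resolve a concept to the Upper Model concept that licenses it."""
    return concept if concept in UPPER_MODEL else SUBSUMED_BY[concept]


navy: Spec = (
    "v1", "sail", {
        "actor": ("v2", "ship", {
            "relations": ("v3", "property-ascription", {
                "domain": "v2",
                "range": ("v4", "inoperative", {}),
                "tense": "past",
            }),
        }),
        "destination": ("sasebo", "port", {}),
        "tense": "present-progressive",
    },
)

print([upper_model_type(c) for c in concepts(navy)])
```

The same two functions would serve the circuit domain unchanged once its concepts are entered in the table, which is the sense in which the hierarchy is reusable across domains.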



This is not an isolated case. Other problem- 
atic assignments in the PEA domain include: 

• The process call, as in "The boy is called 
John". Presently call is classified as a 
dispositive-material-action from UM-89, 
boy becomes the actee, and the name, 
'John', becomes a recipient. No actor 
is specified and so a passive construction 
appears (due to a then current shortcut 
defined for the textual reasoning that the 
grammar initiates for selection of active- 
passive clauses). 

• The process generalize to, as in "The result can be generalized to
other cases". Here generalize is again a straightforward
nondirected-action and to other cases is specified as a destination
spatio-temporal circumstance in UM-89 in order to generate the
preposition.

In all of these cases, the role assignments are only being used in order
to achieve the required syntactic pattern given by the particular state
of the grammar of the Penman system: the Nigel grammar of English. In
the first example, the model for the clause being used is that of give
since this class of verbs is bitransitive; in the second, the technique
adopted is as with the case of define as above, where a circumstantial
role is selected purely in order to guarantee the desired preposition.
Although in these cases it is reasonably clear that both grammar and
Upper Model would need to be extended to include the desired process
types, in general the theoretical status of using arbitrary assignments
of concepts to the Upper Model and selections of roles to be expressed
has not been made sufficiently clear. This technique is (or rather
should be) employed only when it is not possible to extend the grammar
and ontology appropriately: only when the grammar has to be taken as
'fixed', e.g., because it is being applied by a user who does not have
access to the internal organization of the grammar, is this kind of
strategy defensible. As a general technique, the strategy has to be
strongly rejected on theoretical grounds. However, note that without a
commitment to semantic coherence, there is little reason not to use the
Upper Model in this way; we have already seen the similar situation in
the use of the EUROTRA-D system of semantic relations where commitment
to semantic coherence has been explicitly rejected in favor of more
readily operationalizable grammatical criteria.

Figure 4: Upper Model organization reuse with differing domains

Another set of related problems arises when 
semantically similar processes have different 
syntactic realization. Consider, for example, 
the two clause types: 

X is like Y 
X resembles Y 

Although a user might wish to place these 
similarly in the Upper Model, grammatically 
they are rather different. The former requires
the grammatical features: {circumstantial-attribute,
manner-participant, be-intensive};
the latter has the features: {circumstantial-ascription,
circumstantial-process}. In the
present Nigel grammar, these are distinguished
by the inquiry circumstantial-ascription-q,
which would need to examine the
Upper Model. Therefore, in order to obtain
the differing syntactic structures a further dis- 
tinction would need to be set up at the Upper 
Model level. 
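
To make this dependency concrete, the following is a minimal sketch, and not a description of Penman's actual implementation, of how an inquiry such as circumstantial-ascription-q might consult an Upper Model classification in order to choose between the two feature sets. The concept names LikenessAsAttribute and LikenessAsProcess, the dictionary layout and the function names are hypothetical; only the grammatical feature names and the name of the inquiry are taken from the discussion above.

```python
# Hypothetical Upper Model fragment: each concept maps to its parent concept.
UPPER_MODEL = {
    "Relation": None,
    "LikenessAsAttribute": "Relation",  # assumed concept behind "X is like Y"
    "LikenessAsProcess": "Relation",    # assumed concept behind "X resembles Y"
}


def subsumed_by(concept, ancestor):
    """Return True if `ancestor` is `concept` itself or one of its ancestors."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = UPPER_MODEL[concept]
    return False


def circumstantial_ascription_q(concept):
    """Toy stand-in for the inquiry: it answers by inspecting the Upper Model."""
    return subsumed_by(concept, "LikenessAsProcess")


def grammatical_features(concept):
    """Select the feature set that the grammar would need for `concept`."""
    if circumstantial_ascription_q(concept):
        return {"circumstantial-ascription", "circumstantial-process"}
    return {"circumstantial-attribute", "manner-participant", "be-intensive"}
```

In this sketch the two clause types receive different grammatical features only because they have been given different Upper Model concepts, which is precisely the further distinction that the user is forced to introduce.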

The realization class view therefore makes 
it difficult for users to formulate their input 
specification to the system unless they know 
precisely the form of linguistic expression that 
they require. Since the realizational link be- 
tween Upper Model categories and Nigel has 
been made so tight for the very purposes 
of achieving readily describable criteria, it is 
sometimes (and increasingly once more users 
attempt more varied modelling) necessary to 
subordinate a concept in a counter-intuitive 
position simply in order for the language re- 
quired to result. This certainly undermines 
the semantic integrity of the Upper Model as 
an interface ontology and moves the entire 
classification towards a less abstract level of 
information. It needs to be remembered, how- 
ever, that only when the grammar is fixed, is 
a specific, determinate Upper Model required 
— furthermore, that Upper Model is even par- 
tially determined by the particular grammar 
that is specified. 

Finally, one further problem with the in- 
terface ontology instantiated by the Penman 
Upper Model lies precisely in the simplicity of 
the relationship constructed between domain
model and Upper Model. We have seen that 
this is achieved by literally classifying (in the 
formal sense of adding into the subsumption 
lattice) domain model concepts in terms of the
categories from the Upper Model. Following
this operation, the Upper Model and the do- 
main model form a single inheritance hierar- 
chy and the domain concepts directly inherit 
the possibilities for surface realization defined 
for the Upper Model concepts. This opera- 
tion is currently performed only once for each 
domain and, while simplifying input expres- 
sions, it means that the relationship between 
domain and Upper Model is not being han- 
dled particularly flexibly. In fact, once the 
classification is complete, the complete ontol- 
ogy can be interpreted as having collapsed 
into a mixed ontology of the type described 
for LILOG: both particular domain concepts 
and general linguistically motivated concepts 
occur in the same subsumption lattice. This 
treatment of the relationship between a potential
conceptual ontology, containing detailed
knowledge of a domain, and the interface ontology,
containing a semantic classification of
possibilities for linguistic expression, needs to 
be made considerably more flexible to avoid 
the problems of mixed ontologies described 
both above and below. 
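
The classification step, and the sense in which the result is a mixed ontology, can be illustrated with a small sketch. This is an illustration under my own assumptions and not the Penman code: the class Ontology, the concept names Process, DirectedAction and OpenValve, and the realization note are all hypothetical.

```python
class Ontology:
    """A single-inheritance subsumption hierarchy with local realization notes."""

    def __init__(self):
        self.parent = {}       # concept -> its single parent (None at the root)
        self.realization = {}  # concept -> realization note attached locally
        self.origin = {}       # concept -> "upper" or "domain"

    def add(self, concept, parent=None, realization=None, origin="upper"):
        if parent is not None and parent not in self.parent:
            raise KeyError(f"unknown parent: {parent}")
        self.parent[concept] = parent
        self.origin[concept] = origin
        if realization is not None:
            self.realization[concept] = realization

    def classify(self, domain_concept, upper_concept):
        """Subordinate a domain concept to an Upper Model concept (done once)."""
        self.add(domain_concept, parent=upper_concept, origin="domain")

    def inherited_realization(self, concept):
        """Walk up the hierarchy until a realization note is found."""
        while concept is not None:
            if concept in self.realization:
                return self.realization[concept]
            concept = self.parent[concept]
        return None


ontology = Ontology()
ontology.add("Process")
ontology.add("DirectedAction", parent="Process", realization="transitive clause")
ontology.classify("OpenValve", "DirectedAction")
```

After the call to classify, the domain concept and the linguistically motivated concepts sit in one and the same structure, and the domain concept obtains its realization purely by inheritance. Nothing in the structure itself keeps the two kinds of concept apart beyond the bookkeeping entry in origin.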



3.3.2 Interface ontologies that are too abstract

Interface ontologies exhibiting the problem of 
being too abstract are more commonly found 
in small scale systems: the problem of not be- 
ing able to specify the mapping down to gram- 
mar and lexis in a convenient and expandable 
form often prevents large-scale development 
from getting very far. Such projects (e.g., 
POLYGLOSS, ACORD and many others) begin 
by adopting classes of categories developed, 
for example, in analytical philosophy or nat-
ural language semantics, such as the event
types of [Vendler, 1967], temporal categories
such as those of [Moens and Steedman, 1988],
the semantico-'conceptual' predicates pro-
posed by [Jackendoff, 1983, Jackendoff, 1990],
event structures of [Pustejovsky, 1988], and
many others. As long as restricted grammat- 
ical possibilities are entertained, for example 
to enable research on particular focused areas 
of semantics-syntax, then such ontologies are 
adequate — even useful, since the focusing al- 
lows greater depth in the semantic account to 
be achieved. It should also be the case, how- 
ever, that this work then feeds back into more 
general and broader ontology work, and this
happens much too rarely. It is also sometimes
unclear what the relationship of these ontolo- 
gies would be to a more abstract conceptual 
ontology — this may be expressed formally, 
for example, in terms of a model-semantic the- 
ory but the details are often left for future 
work. 

3.3.3 Brief discussion 

It is clear that ontologies of the interface 
type that are more closely bound to lan- 
guage are nevertheless most useful for nlp 
systems that want to deal with a wider va- 
riety of actual language phenomena. The in- 
crease in abstraction may not be so very great 
in comparison to a desired conceptual ontol- 
ogy, but it is nevertheless better than work- 
ing with grammar and lexis directly. Such 
work is also much more likely to be sta- 
ble in the face of changing theoretical posi- 
tions and more justifiable with respect to ac- 
tual linguistic data. It is, then, natural that 
one further type of nlp project attacking
the problem of large-scale ontology construc-
tion is that of 'dictionary'-oriented projects,
such as EDR [Matsukawa and Yokota, 1991]
and ACQUILEX [Calzolari, 1991]. The edr
project aims at producing a 'concept dictio- 
nary' containing 400,000 'word senses' for En- 
glish and Japanese, and ACQUILEX is con- 
cerned with producing a re-usable 'lexical 
knowledge base' that classifies entries accord- 
ing to taxonomies of semantic categories and 
relations between those categories. Both 
projects have constructed sizeable semantic 
taxonomies relying strongly on differences in 
lexico-grammatical realization for the cate- 
gories adopted. The taxonomy organization 
and categories found in ACQUILEX have simi- 
larities to the view of lexical semantics pro- 
posed by [Pustejovsky, 1991] where, again,
oppositions in linguistic behavior are an es-
sential motivating criterion. Another large 
project partly leading up to this work, and 
now related to the kbmt work mentioned 
above, was the MIT Lexicon Project where 
extensive classification of lexemes was under- 
taken on the basis of the differing grammatical 
patterns that the lexemes may enter into. 

Although the construction of large knowl- 
edge bases at this level of abstraction is bound 
to offer a definite improvement in our abil- 
ity to rely on linguistic motivations in future
ontology design, their availability will not of
itself bring about that design. It is still neces- 
sary to consider methodologies for using such 
information so that appropriate ontologies for 
general nlp use can be constructed. There- 
fore, in the next section I will relate the kinds 
of ontologies that we have seen in this section 
to compatible linguistic theories. Without a 
broader view of what is being done linguisti- 
cally when categories for a particular kind of 
ontology are proposed, I believe it is unlikely 
that progress will be made. As long as the 
categories developed are sufficiently close to 
the surface details of language to remain ob- 
jectively verifiable, i.e., remain in the realms 
of syntax and lexico-grammatically oriented 
interface ontologies, useful classifications can 
be constructed. For more abstract levels, however,
the support of theory becomes crucial for
defining methodologies, questions, and possi- 
ble solutions. 



4 Linguistic support (or otherwise) for the ontology types

In this section, I will follow the ordering of the 
discussion of ontology types of the previous 
section: i.e., first linguistic theories compat- 
ible with the design of mixed ontologies will 
be mentioned, followed by the kind of linguis- 
tic theory that is more supportive of distinct 
interface and conceptual ontologies. I will not 
raise the issue here of the relationship between 
'conceptual ontologies' and possible linguistic 
theories, since one of the defining phrases that 
is often used about this level of abstraction is 
its very extra-linguisticness. This does, however,
depend on the view of the linguistic sys-
tem that is adopted and I will mention some- 
thing about this later. Finally in this section, 
I discuss some disadvantages of the former ap- 
proaches when considered as a methodology 
for developing the kinds of resources neces- 
sary for NLP systems. 

Before beginning the discussion, I should 
however briefly note the motivation for an 
exclusion of forms of semantics such as sit- 
uation semantics, model-theoretic semantics 
of various kinds, etc. below. Such accounts 
are not immediately relevant to the discus- 
sion at hand precisely because they have not
been concerned with the construction of rep-
resentations that are directly supportive of
ontologies. That is, regardless of whether 
the formal account of semantics proposed 
in some particular framework contains sets 
of predicates that are of mixed ontological 
status, or are purely conceptual, or purely 
(linguistically) semantic, we find one crucial 
component of ontological engineering miss- 
ing. Those categories are not typically built 
up into subsumption lattices of sorts shar- 
ing various general properties of use for fur- 
ther domain classification. It is clear that 
many of these theoretical approaches could 
easily move in this direction, and with the 
increased use of sorts in linguistic theory at 
all levels of description some first steps have
been taken (e.g., [Sag and Pollard, 1991,
p78], [Nerbonne, 1992]). However, as pointed
out by [Onyshkevych and Nirenburg, 1991]:



'The crucial point is that in order to 
have an explanatory power, the atoms 
of [a] meaning representation language 
must be interpreted in terms of an in- 
dependently motivated model of the 
world. Moreover, if any realistic ex- 
periments have to be performed with 
such an nlp system, this world model 
(sometimes called an ontology) must be 
actually built, not only defined alge- 
braically.' 

Therefore, until the problem of ontology con- 
struction on a realistic scale itself becomes an 
issue for an account, that account remains of 
less central concern for the current discussion. 



4.1 Mixed ontologies and linguistic theory

The closest linguistic approaches to support 
mixed ontology design such as that found in 
lilog are, perhaps surprisingly, those compatible
with the work of [Jackendoff, 1983,
Jackendoff, 1990]. Jackendoff adopts the position
that the semantic level of representation
with which he is concerned is also conceptual,
i.e., common to modalities such as
language and vision [Jackendoff, 1983]. As
pointed out by [Herweg, 1991], approaches
that directly link syntax with conceptual in- 
terpretation now occupy a rather standard po- 
sition in mainstream linguistics and so there 
are many approaches that could be described.
That of Jackendoff is probably one of the
most developed and well known in this di- 
rection, although there are also similarities 
to be drawn with work in Cognitive Linguistics
[Langacker, 1987, Talmy, 1987] and directions
such as that of [Wierzbicka, 1988]. All
of these approaches share an orientation to
language as an instrument for revealing facets 
of conceptual organization. This is stated 
most clearly by Jackendoff in what
he terms the Grammatical Constraint:



'. . . it would be perverse not to take as 
a working assumption that language is 
a relatively efficient and accurate en- 
coding of the information it conveys. 
To give up this assumption is to refuse 
to look for systematicity in the rela- 
tionship between syntax and semantics. 
A theory's deviations from efficient en- 
coding must be rigorously justified, for 
what appears to be an irregular rela- 
tionship between syntax and semantics 
may turn out merely to be a bad theory
of one or the other.' [Jackendoff, 1983, 
p404] 



Given his equation of semantic structure and 
conceptual structure, this becomes largely 
equivalent to statements such as the following
describing the basic claim of cognitive
linguistics: 

"... across the spectrum of languages, 
the grammatical elements that are en- 
countered, taken together, specify a 
crucial set of concepts. This set is 
highly restricted: only certain concepts 
appear in it, and not others. . . [This] 
set of grammatically specified notions 
collectively constitutes the fundamental 
conceptual structuring system of lan- 
guage. That is, this cross-linguistically 
select set of grammatically specified 
concepts provides the basic schematic 
framework for conceptual organization 
within the cognitive domain of lan-
guage." [Talmy, 1987, p165/6]



This position also appears in the approach 
of Pustejovsky to the relation between lex- 
emes and their interpretation in context; as 
he writes, 

'The meaning of words should somehow 
reflect the deeper, conceptual struc- 
tures in the system and the domain it 
operates in. This is tantamount to stating
that the semantics of natural language
should be the image of nonlinguistic
conceptual principles (whatever
their structure).' [Pustejovsky, 1991,
p410]

These approaches are all described by the first 
architecture depicted in the diagram of Figure 1.
Each suggests that there is a portion
of the conceptual ontology that has a direct 
linguistic connection and that that portion 
should have just the same kind of organization 
as the rest of the conceptual ontology. A specification
of the semantics of some expression
is simultaneously a (possibly partial)
conceptual specification. Again,
this state of affairs receives a very explicit de- 
scription from Jackendoff: 

'This account of the syntax-semantics 
correspondence gives a principled ac- 
count of the level of "argument struc- 
ture" found in various versions of GB 
and LFG ... - a level of linguistic rep- 
resentation that lists the arguments of 
a verb, with or without their θ-roles.
Such a list can now be simply con- 
structed from the set of indices in the 
conceptual structure of the verb, and 
there is one index per syntactically ex- 
pressed argument... In short, "argu- 
ment structure" can be thought of as 
an abbreviation for the part of conceptual
structure that is "visible" to the
syntax.' [Jackendoff, 1984, p404/5]

By virtue of the Grammatical Constraint, 
therefore, Jackendoff adopts a very close bind- 
ing of linguistic analysis and categories at 
his semantico-conceptual level of represen- 
tation: available linguistic realizations and 
patternings lead directly to the positing of 
corresponding categories and relationships 
at the level of semantic/conceptual struc- 
ture. In Jackendoff's case, the linguistic evidence
admitted is organized in terms of X-bar
theory [Chomsky, 1980, Jackendoff, 1977] and
so close correspondences appear between cat- 
egories of this theory and categories of the 
semantic/conceptual structure. In particular, 
he states that: 

1. "... every major phrasal constituent in 
the syntax of a sentence corresponds to 
a conceptual constituent that belongs to 
one of the major ontological categories." 

2. "... the lexical head X of a major phrasal 
constituent corresponds to a function in
conceptual structure — a chunk of the 
inner code with zero or more argument 
places that must be filled in order to form 
a complete conceptual constituent. The 
argument places are filled by the readings 
of the major phrasal constituents strictly 
subcategorized by X." [Jackendoff, 1983, 
p67]

Thus, he suggests the following approxima- 
tion to conceptual structure for the sen- 
tence The man put the book on the table 
[Jackendoff, 1983, p68].



[EVENT put([THING THE MAN],
           [THING THE BOOK],
           [PLACE on([THING THE TABLE])])]
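
As an illustration only, the conceptual structure for The man put the book on the table can be written down as a nested, typed term. The class name Constituent, its field names and the render method are my own assumptions for the purposes of this sketch and are not part of Jackendoff's notation; the categories and the argument structure follow the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Constituent:
    """A conceptual constituent: an ontological category, a head, and arguments."""

    category: str                        # e.g. "EVENT", "THING", "PLACE"
    head: str                            # function name or referring expression
    args: Tuple["Constituent", ...] = ()

    def render(self):
        """Produce a bracketed string in the style of the example."""
        if not self.args:
            return f"[{self.category} {self.head}]"
        inner = ", ".join(arg.render() for arg in self.args)
        return f"[{self.category} {self.head}({inner})]"


structure = Constituent("EVENT", "put", (
    Constituent("THING", "THE MAN"),
    Constituent("THING", "THE BOOK"),
    Constituent("PLACE", "on", (Constituent("THING", "THE TABLE"),)),
))
```

Each major phrasal constituent of the sentence corresponds to one Constituent object, and each lexical head with argument places corresponds to an object with a non-empty args tuple.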



This structure, if we ignore the textual information
represented in abbreviated form here by
the, shows striking similarities with the in- 
put specification described earlier for Penman 
(cf. Figures |^ and ||). The structure may 
be glossed as stating that a predicate put of 
type event holds over three arguments: the 
first two are of type thing, the last is an
on-relation of type place. Each of the pred-
icates is taken to be defined as a semantico-
conceptual category motivated primarily by
linguistic patterning. A further example of the
motivation of semantico-conceptual categories
from linguistic evidence is the following list of
example categories offered by Jackendoff:



Interrogative probe               supports category:

a. What did you buy?              [thing]
b. Where is my coat?              [place]
c. Where did they go?             [direction]
d. What did you do?               [action]
e. What happened next?            [event]
f. How did you cook the eggs?     [manner]
g. How long was the fish?         [amount]



Subsequently, further categories of differentia- 
tions are made working from intuitions on the 
meanings of sentences and their constituents 
supported by example sentences. Moreover,
analogously to the perceived relationship be- 
tween syntactic structures and rules for their 
well-formedness, Jackendoff takes the position 
that the inter-relationships between the se- 
mantic/conceptual categories will also be ex- 
pressed in terms of well-formedness rules. An
example for the category [path] is as follows
[Jackendoff, 1983, p166]:



[PATH] -> [Path {TO | FROM | TOWARD | AWAY-FROM | VIA}
                ({[Thing y] | [Place y]})]



The combination of a number of rules such 
as these begins to define a hierarchy of inter- 
related categories analogous to the standard 
hierarchical organization that I have assumed 
appropriate for ontology construction. 
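
A rule of this kind can be read as a constraint that a candidate structure either satisfies or fails. The sketch below assumes one particular reading of the rule for [path], namely that a path consists of one of the functions TO, FROM, TOWARD, AWAY-FROM or VIA applied to a single argument of category [thing] or [place]. That reading, the tuple encoding of structures and the function name is_well_formed_path are my own assumptions for illustration.

```python
# Assumed reading of the [path] rule: one path function, one argument.
PATH_FUNCTIONS = {"TO", "FROM", "TOWARD", "AWAY-FROM", "VIA"}
PATH_ARGUMENT_CATEGORIES = {"THING", "PLACE"}


def is_well_formed_path(term):
    """Check a term of the form (category, function, (arg_category, arg_head))."""
    category, function, argument = term
    arg_category, _arg_head = argument
    return (
        category == "PATH"
        and function in PATH_FUNCTIONS
        and arg_category in PATH_ARGUMENT_CATEGORIES
    )


to_the_house = ("PATH", "TO", ("THING", "THE HOUSE"))
via_an_event = ("PATH", "VIA", ("EVENT", "THE MEETING"))
```

Here to_the_house satisfies the rule and via_an_event does not, because [event] is not among the admitted argument categories. A collection of such checks, one per category, is one simple way of making the resulting hierarchy of categories testable.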

A comparison of Jackendoff's semantico- 
conceptual categories with, for example, the 
superficially very different categories arising 
from cognitive linguistics is very illuminating 
concerning the role that motivations from lan- 
guage can play for ontology construction. The 
general methodology of proponents of cogni- 
tive linguistics is to examine 'grammatical' el- 
ements — however these come to be defined — 
in order to uncover the conceptual organiza- 
tion they presuppose. For example, Talmy offers
the following breakdown of the this/that
distinction in English:

'A closed-class element of this type 
specifies the location of an indicated ob- 
ject as being, in effect, on the speaker- 
side or the non-speaker-side of a con- 
ceptual partition drawn through space 
(or time or other qualitative dimension).'
[Talmy, 1987, p168]



This is summarized as: 

• a 'partition' that divides a space into
'regions'/'sides'

• the 'locatedness' (a particular relation) of 
a 'point' (or object idealizable as a point) 
'within' a region 

• (a side that is the) 'same as' or 'different 
from' 

• a 'currently indicated' object and a 'cur- 
rently communicating' entity. 

By sampling across a wide range of languages 
the Cognitive Grammarian compiles a list of 
such distinctions and attempts to provide in- 
ternal organization and structure rooted in a 
presumed linguistically relevant area of con- 
ceptual organization. The flavor of this orga- 
nization can be seen in the following examples 
of proposed categories from Talmy. 



Dimension "The category of 'dimension' has 
two principal member notions, 'space' and 
'time'. The kind of entity that exists in space 
is — in respectively continuous or discrete 
form — 'matter' or 'objects'. The kind of 
entity existing in time is, correspondingly,
'action' or 'events'..." [Talmy, 1987, p174].
This is schematized as: 



dimension     continuous     discrete

space:        matter         objects
time:         action         events



Plexity 'Plexity' is a generalization of notions 
such as singular and plural to cover actions 
also. For example: 

               matter             action

a. uniplex     A bird flew in.    He sighed (once).
b. multiplex   Birds flew in.     He kept sighing.

Boundedness 'Boundedness' is a generalization 
of notions such as mass and count with re- 
spect to nouns to include again actions in ad- 
dition to objects. This Talmy relates to im- 
perfective and perfective and similar terms in 
the treatment of verbs. Essentially, "[w]hen 
a quantity is specified as 'unbounded', it is 
conceived as continuing on indefinitely with 
no necessary characteristic of finiteness in- 
trinsic to it. When a quantity is specified as 
'bounded', it is conceived to be demarcated
as an individual unit entity." ([Talmy, 1987,
p178]). Similar, far more formal, expressions
of this idea can now be found in a number of
approaches (e.g., [Krifka, 1989]).



Dividedness "A quantity is 'discrete' (or 
'particulate') if it is conceptualized as 
having breaks, or interruptions, through 
its composition. Otherwise, the quan- 
tity is conceptualized as 'continuous'."
[Talmy, 1987, p180]
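
One way of picturing these four categories is as independent feature dimensions, each of which assigns a value to a quantity. The following sketch is my own illustration and not a formalization proposed by Talmy. The class and field names are hypothetical, and the lookup in entity_kind rests on the assumption that the continuous/discrete contrast in the 'dimension' table is the same contrast as 'dividedness'.

```python
from dataclasses import dataclass
from enum import Enum


class Dimension(Enum):
    SPACE = "space"
    TIME = "time"


class Plexity(Enum):
    UNIPLEX = "uniplex"
    MULTIPLEX = "multiplex"


class Boundedness(Enum):
    BOUNDED = "bounded"
    UNBOUNDED = "unbounded"


class Dividedness(Enum):
    DISCRETE = "discrete"
    CONTINUOUS = "continuous"


@dataclass(frozen=True)
class Quantity:
    """A quantity carries one value from every category at the same time."""

    dimension: Dimension
    plexity: Plexity
    boundedness: Boundedness
    dividedness: Dividedness

    def entity_kind(self):
        """Look up the kind of entity in the 'dimension' table."""
        table = {
            (Dimension.SPACE, Dividedness.CONTINUOUS): "matter",
            (Dimension.SPACE, Dividedness.DISCRETE): "objects",
            (Dimension.TIME, Dividedness.CONTINUOUS): "action",
            (Dimension.TIME, Dividedness.DISCRETE): "events",
        }
        return table[(self.dimension, self.dividedness)]


# The boundedness and dividedness values below are assumed for illustration.
a_bird = Quantity(Dimension.SPACE, Plexity.UNIPLEX,
                  Boundedness.BOUNDED, Dividedness.DISCRETE)
kept_sighing = Quantity(Dimension.TIME, Plexity.MULTIPLEX,
                        Boundedness.UNBOUNDED, Dividedness.CONTINUOUS)
```

The sketch shows only the cross-classification. It says nothing about how each value is realized linguistically, which is where the accounts discussed here are least explicit.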



These categories hold of a given 'quantity' 
simultaneously and so classify that quantity 
along the dimensions described. Moreover, 
different linguistic consequences are intended 
to follow from each distinction. Although 
there are many interesting distinctions sug- 
gested which could help enrich proposed on- 
tologies along a number of dimensions, the 
lack of an accepted, detailed grammatical 
framework nevertheless limits the generaliza- 
tions that can be found. Langacker claims 
that: 

". . . basic grammatical categories such 
as noun, verb, adjective, and ad- 
verb are semantically definable. The
entities referred to as nouns, verbs, etc.
are symbolic units, each with a seman- 
tic and a phonological pole, but it is 
the former that determines the catego- 
rization. All members of a given class 
share fundamental semantic properties, 
and their semantic poles thus instanti- 
ate a single abstract schema subject to 
reasonably explicit characterization. A 
noun, for example, is a symbolic struc- 
ture whose semantic pole instantiates 
the schema [thing] ... In a similar fash- 
ion, a verb is said to designate a pro- 
cess, whereas adjectives and adverbs 
designate different kinds of atemporal
relations." [Langacker, 1987, p189]



Although with the proposed conceptual cat- 
egories restricted in this way to follow from 
grammatical categories that are so directly 
'observable', i.e., often inflectional and word- 
based such as singular and plural, mass and 
count, nouns and verbs, etc., one would not 
expect a particularly rich ontology, in fact, a 
large number of finely differentiated categories 
are set up — primarily on the basis of con- 
trastive examples that do not rely on detailed 
syntactic analysis. This shows conclusively 
the value of examining a very wide range of 
naturally occurring examples, in contrast to the
oft criticised (e.g., [Rohrer, 1986]), but nev-
ertheless still prominent, tendency in main- 
stream linguistics to study constructed ex- 
amples in areas that illuminate the currently 
fashionable linguistic phenomena. Neverthe- 
less, the lack of a formally specifiable mapping 
between the categories proposed and linguistic 
realization renders the consequences of estab- 
lishing any particular set of categories almost 
impossible to investigate. This is certainly
less of a problem in a contrasting account such
as that of Jackendoff where the relation to a 
detailed account of grammar and lexis is al- 
ways clear. The value noted above of being 
able to test out and justify proposed categories 
for ontologies formally applies here strongly. 
Jackendoff is able, therefore, even on the basis 
of rather limited linguistic breadth of motiva- 
tion, to suggest a more detailed set of cate- 
gories and interrelationships. The semantico- 
conceptual representations are substantially 
more abstract than syntactic classes (as ev- 
idenced by the generalizations that they per- 
mit to be drawn) but are nevertheless tied 
reasonably precisely with possibilities for linguistic
expression. An ideal situation would
therefore be to have a very broad, detailed 
and formally specified grammar, capable of 
describing very fine-grained grammatical and 
lexical differences. 

Even despite the lack of formally speci- 
fied mappings to linguistic form within cog- 
nitive linguistics, there has still been at 
least one significant application of its pro- 
posed concepts in a computational con-
text. This is in their use to provide a sys-
tem of semantic features for stating mean-
ings to be preserved across languages in
machine translation [Zelinsky-Wibbelt, 1987,
Zelinsky-Wibbelt, 1988]. Although the work
suffers from the lack of explicit definition 
that the conceptual categories have so far re- 
ceived — making it difficult for coders us- 
ing the semantic features reliably to clas- 
sify the meanings that are involved — this 
situation may be improved significantly by 
some current work in progress (footnote 18), which is in-
tended to improve the necessary connection 
between the semantic categories and their 
linguistic realization. The situation in apply-
ing Jackendoff's categories in a computational
context has, as would be expected, been
more straightforward. A number of proposals
have been made for such an application, and
some have been implemented. For example,
[Meteer, 1988] comments on the possible or-
ganization of abstract linguistic terms at the 
text message level for the sentence generator 
Mumble-86 that a system such as Jackendoff's
could provide, and we have already seen that
both Dorr's work on the UNITRAN translation
system [Dorr, 1987, Dorr, 1990] and approaches
within kbmt [Nirenburg and Levin, 1991]
have implemented aspects of the semantico-
conceptual structure (footnote 19).



18 For example by Cornelia Zelinsky-Wibbelt and 
Wiebke Ramm of IAI/Eurotra-D on syntactic tests 
that coders could apply to resolve difficult cases. 

19 Further analogous areas of research which often 
fall somewhere between the explicit grammatical foun- 
dation attempted by Jackendoff and the, until now, 
more impressionistic linguistic motivations of Lan- 
gacker and others, include the large body of work 
on the 'conceptualization' and linguistic expression of
spatial-temporal information, e.g., [Herskovits, 1986,
Bierwisch and Lang, 1989] and many others.

4.2 Interface ontologies and linguistic theory

In contrast to the accounts of the previous 
section, the separation of information found 
in the interface ontology and a more abstract 
conceptual ontology is consonant with theo- 
retical positions that assume a higher degree 
of stratification in the linguistic system. The 
mixed ontology view goes well with a stan- 
dard syntax-semantics-pragmatics distinction 
where 'semantics' includes the conceptual rep- 
resentation and 'pragmatics' provides pro- 
cedures that operate over the semantico- 
conceptual representation to produce active 
interpretations in context. In this sense, prag- 
matics is not a further stratum in a linguis- 
tic system and has a theoretical status distinct
from that of syntax or semantics. In con-
trast to this, the interface ontology architec- 
ture suggests at least a three-way stratifica- 
tion between lexico-grammatical information, 
semantic information, and a contextualizing 
level of 'conceptual' information. Each of 
these strata appears to have rather similar 
formal properties: most of the information of 
each, for example, would appear to be repre- 
sentable as a subsumption lattice defined over 
sorts, possibly augmented with structured in- 
heritance (footnote 20).

I have already mentioned one view of the 
linguistic system that seems compatible with 
this stratification: the approach to seman-
tics and context proposed by [Bierwisch, 1982,
Bierwisch and Lang, 1989] that acted as one
influence for the lilog design — even though 
the final specification of the ontology within 
lilog does not seem to have remained in 
the spirit of this theory. Within the linguis- 
tic model of Bierwisch, conceptual represen- 
tations are maintained strictly separate from 
semantic representations, and semantic ex- 
pressions are used to constrain construction 
of conceptual expressions during interpreta- 
tion. Thus, 'words' (actually lexicogrammat- 
ical patterns) are related to semantic forms 
which determine functions from contexts to 
conceptual structures. The distinction be- 
tween the two levels in this kind of two-level 
semantics is nicely summarized by Michael 
Herweg as follows: 



'Semantic representations are struc- 
tured configurations of semantic units 
which, on the one hand, are deter- 
mined by the grammatical system of 
the language in question and, on the 
other hand, are grounded in — or 
motivated by — the conceptual sys- 
tem. . . . Conceptual representations 
are structured configurations of concep- 
tual units, which are mental representa- 
tions of certain aspects of the external
world.' [Herweg, 1991, p152/3]



The two classes of categories — the semantic 
and the conceptual — thus have very different 
theoretical statuses and allow very different 
kinds of motivations. This is therefore pre- 
cisely the kind of structuring of the linguistic 
system that one requires to support the use of 
interface and conceptual type ontologies.

The most successful of the interface ontolo-
gies described in Section 3.2, the Penman Up-
per Model, clearly has a natural relation to 
the stratification found in this kind of 'two- 
level semantics'. For example, the sorts of the 
Upper Model are determined by the gram- 
matical system (concretely, the Nigel gram- 
mar component of the Penman system) as is 
required. Although there is also a relation- 
ship to be drawn with accounts that are ex- 
plicitly seeking semantic organizations closely 
linked to language regardless of these organi- 
zations' further embedding at higher levels of 
abstraction (footnote 21), the relation between a proper
view of the Upper Model and two-level semantics
becomes even closer when we examine,
instead of the Upper Model itself, the
theoretical position of which the Upper Model
is only a very partial instantiation: i.e., that
of systemic-functional theory [Halliday, 1961,
Halliday, 1978, Matthiessen and Halliday, forthcoming].

Systemic-functional theory is a highly strat- 
ified general linguistic theory with respect to 
which the Penman text generation system and its de-
scendants have been, and continue to be, de-
veloped. In some ways perhaps analogously 
to the situation in LILOG, many aspects of the 
current implementation of the Penman system 
are not accurate instantiations of that theory. 
Of particular importance here is the very in- 
stantiation of the concept of linguistic strata,
since it is precisely this construct which
is necessary for motivating the kind of multi-
levelled representation that we find in inter-
face ontologies and their contextualizing con-
ceptual ontologies.

20 Current work in information-based syntax makes
this point for syntax [Pollard and Sag, 1987].

21 Or, alternatively, by seeking an embedding in an
account such as model-theoretic semantics to 'bottom
out' in a formally specifiable way.

The notion of stratification in systemic- 
functional theory is depicted diagrammati- 
cally in Figure |^. The linguistic system 
is broken down here into three strata: lex- 
icogrammar, semantics, and context. Be- 
tween each stratum the same relationship,
that of realization, holds. Systemic-
functional theory is essentially a functional 
theory, i.e., one that is concerned crucially 
with the functions that language fulfills in 
particular contexts, and this informs the un- 
derstanding of the realization relationship be- 
tween strata as follows. Each higher-level (i.e., 
more abstract) stratum is seen as providing 
the functional motivation for the next lower- 
level stratum; and each lower-level stratum is 
seen as providing a resource that generalizes 
across the possibilities of the next-higher stra- 
tum [Halliday, 1978]. This gives us a more de- 
tailed view on how strata in the linguistic sys- 
tem interact than that usually found in strat- 
ified accounts. Additionally, each higher-level 
stratum is seen as contextualizing the levels 
beneath. 

The organization of the Penman-style ar- 
chitecture version of systemic theory instan- 
tiates the stratification as follows. Near- 
est the surface there are realization state- 
ments of syntagmatic organization, or syntac- 
tic form. These statements are classified in 
terms of their potential for expressing commu- 
nicative functions that are realized grammati- 
cally, such as asserting/questioning/ordering, 
active/passive, etc.: this denotes paradig- 
matic organization and is represented in terms 
of a grammatical system network. This or- 
ganization captures the possible alternatives 
that are available given any choices that 
have already been made; i.e., a collection 
of 'paradigms' of the form 'a linguistic unit 
of type A is either an A of functional sub- 
type X, or an A of functional subtype Y, ..., 
or an A of functional subtype Z' is given. 
At each level these subtypes are disjoint and 
serve to successively classify linguistic units 
along ever more finely discriminated dimen- 
sions. This formulation of classifications in 
terms of increasingly fine discrimination is 
in systemic-functional linguistics termed the 



principle of delicacy. The grammatical com- 
municative functions are then in turn moti- 
vated by semantic distinctions that classify se- 
mantic circumstances according to the gram- 
matical features which are appropriate to ex- 
press those situations: this classification is the 
combined responsibility of choosers and in- 
quiries [Mann, 1983]. Finally, the possibili- 
ties for classification that the inquiries have 
are defined in terms of the abstract ontol- 
ogy of the Upper Model. In relation to Fig- 
ure 5, then, the Penman-style architecture 
represents a computational instantiation only 
for the lower two strata and the relationship 
between them. 
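
The paradigmatic organization just described can be sketched as a toy system network. This is only an illustration under stated assumptions, not the Nigel grammar: the system names, the features, the `System` class, and the semantic keys `speech_function` and `agent_in_focus` are all hypothetical, and each chooser is collapsed into a single function rather than a chooser that poses inquiries against the Upper Model. 

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class System:
    """One 'paradigm': a unit bearing `entry` is exactly one of `options`."""
    name: str
    entry: str                      # feature that must already have been chosen
    options: List[str]              # disjoint functional subtypes
    chooser: Callable[[Dict], str]  # inspects the semantic input, picks one option


# Hypothetical choosers: each maps semantic circumstances to a grammatical feature.
def choose_mood(sem: Dict) -> str:
    return "imperative" if sem["speech_function"] == "command" else "indicative"


def choose_indicative_type(sem: Dict) -> str:
    return "interrogative" if sem["speech_function"] == "question" else "declarative"


def choose_voice(sem: Dict) -> str:
    return "active" if sem["agent_in_focus"] else "passive"


# A three-system fragment; INDICATIVE-TYPE is more 'delicate' than MOOD.
NETWORK = [
    System("MOOD", "clause", ["indicative", "imperative"], choose_mood),
    System("INDICATIVE-TYPE", "indicative",
           ["declarative", "interrogative"], choose_indicative_type),
    System("VOICE", "clause", ["active", "passive"], choose_voice),
]


def traverse(network: List[System], root: str, sem: Dict) -> List[str]:
    """Classify a unit ever more finely, entering a system once its entry feature is chosen."""
    chosen, agenda = [root], [root]
    while agenda:
        feature = agenda.pop(0)
        for system in network:
            if system.entry == feature:
                option = system.chooser(sem)
                if option not in system.options:
                    raise ValueError(f"{system.name}: {option!r} is not an option")
                chosen.append(option)
                agenda.append(option)
    return chosen


features = traverse(NETWORK, "clause",
                    {"speech_function": "question", "agent_in_focus": True})
print(features)
```

The design point the sketch tries to capture is that no system can be entered without a chooser supplying a semantic reason for the selection, which is what ties the lower stratum to the one above it. 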

While at a rather general level very simi- 
lar to the breakdown proposed by Bierwisch, 
the systemic-functional account also goes into 
more detail about the internal organization of 
each stratum. It is this feature which is largely 
responsible both for the more abstract status 
that has been achieved for the sorts of the Up- 
per Model and for the early adoption of the 
principle of motivating sources on the basis 
of the grammar. Not only is all grammati- 
cal variation captured by abstract choices be- 
tween minimal grammatical alternatives, but 
also all such abstract choices must have ex- 
plicit motivations, or semantic conditions, de- 
fined. Only then is the grammar fully defined 
as a resource for grammatical expression: we 
have to know what each grammatical possi- 
bility is an expression of. This has naturally 
given rise to the notion of covering the gram- 
mar in terms of a set of motivations for each 
choice that the grammar offers. This is de- 
picted graphically in Figure 6. The categories 
necessary for this motivational covering are 
then organized into sorts in a subsumption 
lattice — thus defining the Upper Model. 
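
The idea of covering the grammar can be made concrete with a small check. The following is a minimal sketch under assumptions: the grammar fragment, the sort names, and the helper functions are hypothetical stand-ins and are not taken from the Penman Upper Model itself. It lists the grammatical choices that still lack a semantic motivation and tests subsumption among the motivating sorts. 

```python
from typing import Dict, List, Tuple

# Hypothetical grammar fragment: system name -> the alternatives it offers.
GRAMMAR: Dict[str, List[str]] = {
    "PROCESS-TYPE": ["material", "mental", "relational", "verbal"],
}

# Hypothetical motivations: (system, option) -> the sort that motivates the choice.
MOTIVATION: Dict[Tuple[str, str], str] = {
    ("PROCESS-TYPE", "material"): "material-process",
    ("PROCESS-TYPE", "mental"): "mental-process",
    ("PROCESS-TYPE", "relational"): "relational-process",
}

# Subsumption lattice over the motivating sorts: sort -> its immediate supersorts.
PARENTS: Dict[str, List[str]] = {
    "material-process": ["process"],
    "mental-process": ["process"],
    "relational-process": ["process"],
    "process": [],
}


def unmotivated(grammar: Dict[str, List[str]],
                motivation: Dict[Tuple[str, str], str]) -> List[Tuple[str, str]]:
    """Return the choices that are so far only formally defined (no motivation)."""
    return sorted((system, option)
                  for system, options in grammar.items()
                  for option in options
                  if (system, option) not in motivation)


def subsumes(general: str, specific: str, parents: Dict[str, List[str]]) -> bool:
    """True if `general` is `specific` or lies above it in the lattice."""
    if general == specific:
        return True
    return any(subsumes(general, parent, parents)
               for parent in parents.get(specific, []))


print(unmotivated(GRAMMAR, MOTIVATION))
print(subsumes("process", "mental-process", PARENTS))
```

In this reading the grammar serves as a checklist: the ontology counts as covering it only when `unmotivated` returns an empty list. 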

It is worth noting that this provides a 
very strong methodology for interface ontol- 
ogy construction. Until a grammar alterna- 
tion is explicitly connected into a motivational 
relationship, the alternation is considered to 
be only formally (in the sense of linguistic 
form) defined. The grammar in fact acts as 
a (highly structured) list of phenomena that 
require semantic motivation. In addition, the 
functional organization of the grammar itself 
goes a long way towards providing a useful 
pre-classification of syntactic phenomena so 
as to be amenable to systematic semantic in- 
terpretation. The extra boost in abstraction 
that the grammar offers is responsible for the 
increased level of abstraction that the Upper 
Model has already achieved. 

Figure 5: Stratification with systemic-functional linguistics 

Figure 6: Covering the grammar semantically (diagram labels: semantics, lexicogrammar) 

I have noted, however, that the Upper 
Model does not instantiate the full organi- 
zation required by the theory. Some of the 
consequences of this have already been men- 
tioned. For example, the upwards relation 
of the Upper Model to context has not been 
modelled within Penman in terms of a 'real- 
ization' relationship: domain models are di- 
rectly subordinated to the Upper Model hi- 
erarchy. I will return to this and some other 
problems below. Importantly, since the full 
input of the theory has not yet been taken 
into account, we have available a number of 
possible directions for development that may 
provide a far more sophisticated implementa- 
tion of both interface and conceptual ontolo- 
gies. 

4.3 A comparative evaluation 

In this section, I will apply the possible lin- 
guistic theoretical underpinnings that could 
be provided for the different ontology types in 
order to consider those ontology types more 
critically. I will suggest here that there are 
clear reasons for dispreferring accounts that 
adopt a mixed ontology approach. Subse- 
quently, in the next section, I will discuss 
possible future developments for interface and 
conceptual ontologies drawing further on the 
connections to theory established. 

We have seen that the type of account based 
on a style of argumentation such as that of 
Jackendoff manages to gain abstractness while 
still maintaining contact with details of lin- 
guistic realization. I have noted that the in- 
crease in abstraction is a generally necessary 
property for improving the functionality of 
NLP systems. One of the principal differences 
between such an account and the linguistic 
theories supportive of interface ontologies was 
in the degree and explicitness of stratifica- 
tion. One can ask the question, therefore, 
is there any evidence for the more stratified 
view of the linguistic system? If it proves nec- 
essary, or beneficial, to differentiate between 
information that is particularly linguistic and 
the kind of information sought in accounts of 
real-world, commonsense knowledge, then a 
mixed ontology will not be sensitive to this. A 
very important issue to address is, therefore, 



whether the selection between a mixed ontol- 
ogy and a more differentiated set of interre- 
lated ontologies is still open to debate, or 
arbitrary, or whether there are grounds for 
deciding for one architecture over the other. 

4.3.1 Populating a mixed ontology 

If we assume that we have an account such 
as that proposed by Jackendoff, possibly aug- 
mented by a range of concepts from cognitive 
linguistics with a more formally expressed re- 
lationship to the lexicogrammar, it is still the 
case that the resulting ontology is not 
yet very large. The number of general sorts 
that occur in, for example, [Jackendoff, 1990] 
(i.e., not the conceptual equivalents of lexical 
items, which seem to be introduced freely), 
is less than 40: these include predicates such 

as EVENT, STATE, BE, ORIENT, PATH, GO, 
WITH, FROM, TO, TOWARDS, INCHOative, RE- 
ACT, AFFect, etc. Most conceptual items are 
decomposed into these 'primitives'. I will 
not discuss whether or not these items are 
good candidates psychologically for concep- 
tual primitives, but relying on this small set of 
categories is unlikely to capture many general- 
izations of linguistic expression when a broad 
lexicogrammar is considered. If we include ex- 
perience such as that obtained within the de- 
velopment of the LILOG ontology or the Upper 
Model, many intermediate sorts will express 
useful generalizations over distinct linguistic 
patterns. Relying on a smaller than neces- 
sary set of categories either misses general- 
izations or places more work on the mapping 
with lexicogrammatical form. The method- 
ological question arises of how the sort hier- 
archy is to be extended beyond the very gen- 
eral categories that most attempts at ontol- 
ogy construction assume as basic on intuitive 
grounds. 

The primary source of evidence for exten- 
sion is the classification of lexicogrammati- 
cal patterns. This posits semantic features 
that co-occur with particular classes of Lex- 
ical Conceptual Structures. But these classes 
are constructed on the purely syntactic lin- 
guistic behaviour of the investigated lexemes. 
This, while being the best methodology avail- 
able and one I have defended throughout this 
paper, cannot itself be expected to give rise 
to conceptual classes. Only the assumption 
that such semantic patterns are simultane- 
ously conceptual makes this plausible: there 
is no obvious connection to be drawn between 
aspects of domain and commonsense knowl- 
edge and lexically derived categories. The 
latter are often subject to criticism for be- 
ing too shallow even for an interface ontol- 
ogy: they must appear very unlikely can- 
didates for a conceptual ontology. As we 
have found with the problems with the Upper 
Model (Section 3.3.1), there is no guarantee 
that particular domain-motivated categories 
will choose lexically-motivated categories that 
belong to a consistent more general ontologi- 
cal type. More often items belonging to very 
different lexical classes are treated as seman- 
tically equivalent for speakers' expressive pur- 
poses. Representing this in a single ontology 
then requires that concepts may be consis- 
tently classified along the two dimensions si- 
multaneously: which complicates the formal 
properties of the resulting ontology consid- 
erably since exactly what may be inherited 
where becomes unclear. 

This is shown concretely in the linguisti- 
cally motivated evaluation that [Lang, 1991] 
undertakes for LILOG. There he examines the 
sorts proposed for the ontology according to 
the kinds of motivations accepted for their in- 
clusion. He finds the following differentially 
motivated sorts all combined in the single sub- 
sumption lattice: 

• 'Conceptually based sorts' which are in- 
cluded on extra-linguistic (conceptual) 
grounds. 

• 'Text base specific sorts' which are con- 
cepts corresponding to special vocabulary 
items required by the particular domain 
and text with which LILOG as a project 
was concerned. 

• 'Sorts projected from the grammar' 
which are notions found in the grammar, 
such as preposition, transferred to the on- 
tology. 

• 'Sorts of mixed origin' which are concepts 
where both extra-linguistic and linguistic 
criteria are involved. 

This mixing of motivations organizes itself 
loosely according to the vertical and horizon- 
tal dimensions in the hierarchy. Thus, 



'The vertical structure of the sort hier- 
archy, which is based on the subsump- 
tion relation, draws mainly on the avail- 
ability of corresponding linguistic la- 
bels categorized as nouns ... or as verbs 
. . . However, the horizontal dimension 
of the sort hierarchy, that is the selec- 
tion of subsorts to be assigned to a com- 
mon supersort, is mainly determined by 
features that emerge from our extra- 
linguistic conceptual knowledge of ob- 
jects and spatio-temporally specifiable 
events or situations ...' [Lang, 1991, 
p466/7] 



Lang identifies the following problems that 
this inconsistency, or variety, of motivations 
for concepts in the hierarchy creates for the 
resulting organization in a single subsump- 
tion lattice. First, since extra-linguistic or con- 
ceptual criteria are less than well understood, 
there is a degree of arbitrariness in the catego- 
rizations that appear. Second, it is never clear 
from the concepts that are found in the hier- 
archy alone whether they are to be expected 
to have a corresponding linguistic effect or 
not. Third, the co-existence of distinct kinds 
of concepts means that the precise meaning of 
'subsumption' with respect to particular cases 
is underspecified — different kinds of concepts 
have different relations between their 'wholes' 
and their 'parts' and until this is clarified it 
is unclear what kind of subsumption actually 
holds. These differences entail different formal 
properties so that different objects can call for 
different inheritance properties. Thus, for ex- 
ample, a supposedly general 'part-whole' re- 
lation is intended sometimes as 'is a compo- 
nent of', sometimes as 'is spatially included 
in', sometimes as 'pertains to', sometimes as 
'inalienable possession', sometimes as 'alien- 
able possession', etc. This range of possibil- 
ities makes the inferences that in fact follow 
from any statement in the ontology far more 
difficult to foresee and substantially compli- 
cates in any case any axioms for inference that 
are designed. 
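
The effect on inference can be illustrated with a small sketch. It rests on assumptions that are mine rather than Lang's: the example facts are invented, and the decision that 'component-of' licenses transitive inference while 'alienable-possession' does not is stipulated purely for illustration. The sketch compares the inferences drawn when the readings are kept apart with those drawn when they are collapsed into one general 'part-whole' relation. 

```python
from itertools import product
from typing import Dict, Iterable, Set, Tuple

Fact = Tuple[str, str, str]  # (part, reading of the relation, whole)

# Hypothetical facts, each tagged with the intended reading of 'part-whole'.
FACTS: Set[Fact] = {
    ("piston", "component-of", "engine"),
    ("engine", "component-of", "car"),
    ("car", "alienable-possession", "driver"),
}

# Assumption for illustration only: which readings allow transitive inference.
TRANSITIVE: Dict[str, bool] = {
    "component-of": True,
    "alienable-possession": False,
}


def derive(facts: Iterable[Fact], transitive: Dict[str, bool]) -> Set[Fact]:
    """Close the facts under transitivity, but only within readings marked transitive."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that `derived` can grow inside the loop.
        for (a, r1, b), (c, r2, d) in product(list(derived), repeat=2):
            if b == c and r1 == r2 and transitive.get(r1, False):
                new_fact = (a, r1, d)
                if new_fact not in derived:
                    derived.add(new_fact)
                    changed = True
    return derived


def collapse(facts: Iterable[Fact]) -> Set[Fact]:
    """Model the mixed ontology: every reading becomes one general 'part-whole'."""
    return {(a, "part-whole", b) for a, _, b in facts}


typed = derive(FACTS, TRANSITIVE)
mixed = derive(collapse(FACTS), {"part-whole": True})
print(sorted(typed))
print(sorted(mixed))
```

With the readings kept apart only the expected component chain is derived; once they are collapsed, the piston comes out as a 'part' of the driver, which is the kind of inference that is hard to foresee from the ontology alone. 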

This can also be seen concretely in many 
versions of semantics where a mixed on- 
tology is relied upon — in order to han- 
dle the very flexibility of the relationship 
between the concepts that are to func- 
tion for the linguistic expression, and those 
which are not, complex and often uncon- 
strained mechanisms are introduced: the 
'projective inheritance' of [Pustejovsky, 1991] 
and many instances of 'type coercion' 
as used by, e.g., [Sag and Pollard, 1991; 
Pustejovsky, 1991] are probably prime exam- 
ples of this, but there are many others. A 
mixed ontology is, therefore, a very weakly 
constraining theoretical construct, which does 
not provide optimal assistance either for the- 
ory construction or for system construction. 

4.3.2 Stratification 

The just mentioned flexibility of relationship 
between conceptual categories and the cate- 
gories that are determinative of their linguis- 
tic realization is a very typical property of a 
relationship between linguistic strata. It is 
this very flexibility, in fact, which provides the 
primary linguistic evidence for stratification. 
As an example of this, consider the following 
issue of ontological design. 

Regardless of whether a mixed ontology is 
adopted or not, some portion of some on- 
tology is assumed which offers an expression 
of the chunking that language expects and 
demands of knowledge if it is going to be ex- 
pressible through the grammatical and lexi- 
cal resources of the linguistic system. One 
question that can be asked, therefore, is: is 
the information in a conceptual ontology that 
will support this chunking already organized 
in this way or not? If it is then it will 
be straightforward to construct a mechanism 
such as that suggested by Jackendoff above, 
whereby one simply 'takes a view' on some 
conceptual structure and already has a spec- 
ification of the semantic predicate-argument 
structure which can in turn control the gram- 
mar and lexis to produce appropriate results. 
If not, however, then some reorganization of 
the structure will be necessary. In all ex- 
amples that are presented of alleged concep- 
tual structures that are already appropriate 
for direct lexicogrammatical realization, e.g., 
by viewing as predicate-argument structure, 
we can make the following observations: 

First, the lexical items and class of gram- 
matical patterns appropriate are already so 
highly constrained as to follow directly from 
the expression; for example, 

[State ORIENT ([Thing WEATHERVANE], [Path NORTH])] 

as one reading for the sentence: The weathervane 
pointed north [Jackendoff, 1990, p74]. 
Certain variability in lexicogrammatical ex- 
pression will be produced by the mapping 
rules of syntax formation, but other decisions, 
including: the choice of word for the concept 
weathervane given that the hearer might not 
know what a weathervane is, or that the sen- 
tence may be uttered among world-experts 
on the subject of weathervanes who would 
normally select a far more restrictive descrip- 
tion, etc. have already been built into the de- 
scription. Widely differing selections of pos- 
sible expression according to text type, reg- 
ister, formality, situation, time availability 
(cf. [Hovy, 1988], [Bateman and Paris, 1 ]) are 
excluded. 

Second, the granularity of the correspond- 
ing language has also been built into the de- 
scription. For example, we know that a sen- 
tence is going to be produced (or if the linking 
rules are good enough: a sentence or a nom- 
inalization) rather than a short discussion of 
the wind's effect on an object whose position 
of equilibrium under the pressure of the wind 
serves as an indication of the wind's direction. 
A nice example²² of maximal flexibility here 
might be the difference, for example, in the 
language produced in response to the concep- 
tual real-world category beer for the purposes 
of a dictionary entry, e.g., 

'Beer is a bitter alcoholic drink made 
from grain. There are a lot of different 
kinds of beer.' [Collins COBUILD En- 
glish Language Dictionary, 1987] 

and that produced for the purpose of an entry 
in an industrial chemical encyclopedia, which 
goes on for 40 pages. 

22 Due to Karin Haenelt. 

The response to both of these problems 
within the semantico-conceptual approach is 
straightforward: the differences are expressed 
beforehand in the semantico-conceptual orga- 
nization and are produced by conceptual pro- 
cesses for information organization and man- 
agement. But this misses the generalization 
that regardless of the information to be ex- 
pressed the same linguistic granularity is im- 
posed: there will be a set of descriptions of 
some predicate with an argument structure, 
including specifications of participants and 
circumstances. The two sentences concerning 
beer in the dictionary and the hundreds in the 
encyclopedia all exhibit the same kind of orga- 
nization. Knowledge is variable scale, but lan- 
guage is predominantly fixed-grain,²³ as de- 
fined by the grammar. This means that for 
all the knowledge available in the semantico- 
conceptual ontology, there need to be con- 
struction mechanisms available which convert 
some selected fragment of the information, of 
any scale, and produce an appropriate sized 
chunk of semantico-conceptual structure for 
motivating a sentence. 

With unconstrained inferencing across the 
knowledge base this may be achievable 
by inheriting constraints back from the 
grammar and checking the equivalence of 
constructed semantico-conceptual structures 
with the originally selected fragment. But, 
crucially, for all such selected fragments, the 
same class of 'semantico-conceptual' para- 
phrases will be potentially available: i.e., 
those licensed by their grammatical express- 
ability. Furthermore, also regardless of the 
originally selected semantico-conceptual frag- 
ment, the lexico-grammatically licensed set 
of 'semantico-conceptual' specifications gov- 
ern specifiable sets of inferences that oper- 
ate only on such specifications: for exam- 
ple the inferences that determine the tex- 
tual variations that are appropriate when 
realizing the specification lexicogrammati- 
cally [Bateman and Matthiessen, to appear], 
that certain abstract semantic classifications 
apply for which there is no conceptual ev- 
idence [Schriefers, 1990], and others. Thus, 
not representing this distinguished set sepa- 
rately fails to capture a significant generaliza- 
tion about the organization of the linguistic 
system as a whole.²⁴ 

23 Apart from the resources for combining clauses, 
nominal groups, etc. into 'complexes', which are not 
relevant to the current argument. 

24 It also engenders dubious NLP system design; 
factoring out the commonalities in a separate stratum 
is analogous to the following application of object- 
oriented programming: 

"In an object-oriented application . . . the 
system uses predefined mappings from ob- 
jects to the routines that know how to pro- 
cess those objects (or can choose among dif- 
ferent routines depending on the context). 
The efficiency of using predefined mappings 
for known types comes in drastically reduc- 
ing or entirely eliminating search; the onus 
is put on the developer to define the de- 
cisions available to a type at each level, 
rather than presenting all options at all 
times and letting a search procedure find 



Finally, it is worth emphasizing that this 
flexibility between strata is typical and not 
unique to the relation between semantics and 
conceptual levels of representation. The re- 
lationship between, for example, the Upper 
Model and the lexicogrammar already ex- 
hibits much of the same kind of flexibility. For 
example, the expressive resources of the gram- 
mar of nominal groups are not restricted to the 
single grain-size of sorts that are subtypes of 
an Upper Model sort object. It is equally pos- 
sible to realize Upper Model classified events 
as nominal groups or configurations of events 
as single clauses if the textual conditions are 
appropriate. [Bateman and Paris, 1 ] present 
other examples of this theoretical flexibility 
for other categories in the grammar. It is not 
at all surprising given the theoretical similar- 
ity, therefore, to find exactly this kind of flex- 
ibility again between the sort lattice of the 
semantic ontology and that of the conceptual 
ontology. 

This discussion of stratification is summa- 
rized in Figure 7. Here we see three strata and 
the repeated variability in expression that any 
selected semantic specification has. Crucially, 
the common, recurring coding possibilities 
that are available for all elements from the 
conceptual stratum are not repeated at that 
level, but are factored into a single statement 
at the level of the semantic interface with a 
mapping from sorts at the conceptual stra- 
tum to sorts at the interface stratum. Not 
representing this generalization both guaran- 
tees a complication of the theory and makes 
a usable NLP system based on the theory un- 
likely. Again, the power of the theory to bring 
methodological and contentful constraints to 
bear on system design is compromised. 
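
The factoring argued for here can be sketched in a few lines. The sketch is a toy under assumptions: the concept names, the two interface sorts, and the listed realizations are hypothetical, chosen only to echo the observation above that events may be realized as clauses or as nominal groups. The coding possibilities are stated once per interface sort, and each conceptual sort merely maps onto an interface sort. 

```python
from typing import Dict, List

# Interface stratum: each coding possibility is stated once, per interface sort.
REALIZATIONS: Dict[str, List[str]] = {
    "event": ["clause", "nominal group"],
    "object": ["nominal group"],
}

# Conceptual stratum: a (hypothetical) domain concept only names its interface sort.
CONCEPT_TO_INTERFACE: Dict[str, str] = {
    "brewing": "event",
    "fermentation": "event",
    "beer": "object",
    "weathervane": "object",
}


def coding_options(concept: str) -> List[str]:
    """Look up coding possibilities via the interface sort, never per concept."""
    return REALIZATIONS[CONCEPT_TO_INTERFACE[concept]]


def add_concept(concept: str, interface_sort: str) -> None:
    """Extending the conceptual stratum needs one mapping and no new realizations."""
    if interface_sort not in REALIZATIONS:
        raise KeyError(f"unknown interface sort: {interface_sort!r}")
    CONCEPT_TO_INTERFACE[concept] = interface_sort


add_concept("malting", "event")
print(coding_options("malting"))
```

A mixed ontology would instead have to repeat the list of realizations on every concept, so that a change in the grammar's possibilities would require touching each concept separately. 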

5 Some Principles and Methods; and some former puzzles resolved 

The discussion up to this point has attacked 
mixed ontologies on the basis that they are 
internally inconsistent, and has criticised the 
non-stratificational linguistic accounts under- 
lying such mixed ontologies on the basis that 

the best one." [Meteer, 1989, p6] 

This is also one property of using an interface ontology 
such as Meteer's or the Upper Model. 

Figure 7: Capturing generalizations via stratification 



they fail to capture theoretically important 
and practically useful generalizations. Both 
weaknesses have one consequence in common: 
they provide a seriously reduced set of con- 
straints for ontology design and construction. 
Since one purpose of this paper is to sug- 
gest principles and methodologies for ontol- 
ogy construction, mixed ontologies and under- 
stratified linguistic accounts are clearly to be 
avoided. The kinds of ontologies most appro- 
priate for nlp systems and for which linguis- 
tic support needs to be sought can now be 
restricted to the following two types. 

• Type Oi: an abstract semantic organiza- 
tion underlying our use of grammar and 
lexis that is motivated on essentially lin- 
guistic grounds and that acts as a com- 
plex interface between lexicogrammatical 
resources and higher-level strata in the 
linguistic system — the categories of this 
interface should be maximally general, 
i.e., apply across distinct real-world situ- 
ations, but specific enough to maximally 
constrain possible lexicogrammatical ex- 
pression. 

• Type Oc: an abstract organization of 
real-world knowledge (commonsense or 



otherwise) that relates downwards to the 
interface to lexicogrammar. 

With these restrictions in place, I will now 
go further and suggest some particular guide- 
lines for ontology construction. In order to 
do this, however, it is also necessary to make 
some further commitments to the kinds of in- 
formation that will be made available at par- 
ticular levels in the linguistic system. The rea- 
son for this is that the linguistic, and particu- 
larly lexico-grammatical, constructs are essen- 
tial for guiding ontology design. This follows 
the increasingly wide range of linguistic the- 
ories that are returning to the position that 
the relation between grammar and semantics 
is not arbitrary; we saw a selection of these 
above, e.g., [Langacker, 1987], [Talmy, 1987], 
[Wierzbicka, 1988], [Jackendoff, 1983], 
[Halliday, 1978]. If we ac- 
cept this, then it is also to be accepted that 
the selection of particular accounts of lexi- 
cogrammar has consequences for the subse- 
quent ontology design. Since such conse- 
quences cannot be avoided, it makes sense to 
make selection decisions in ways which will 
maximally help in the task of ontology con- 
struction overall. I will distinguish between 
decisions in the following two areas: 

• type of grammar 

• contents of grammar 

and will make some firm suggestions for the 
former and discuss the consequences of differ- 
ences that arise in the latter. 

First, we can note that the most successful 
interface ontology developed so far is prob- 
ably that of the Upper Model. The Upper 
Model has achieved both a detailed account 
and a generally applicable account. We can 
ask, then, what is it about its underlying the- 
oretical organization that let this occur? Sec- 
ond, we can further note that although the 
Upper Model is the most detailed instantia- 
tion of an ontology of type Oi that has been 
developed, it is nevertheless not a full instan- 
tiation of the theory on which it is based. 
It is therefore worthwhile considering briefly 
what additional constraints the theory could 
bring to bear if it were to be more fully im- 
plemented. 

The kind of grammar on which the ab- 
stractions proposed in the Penman Up- 
per Model are based is easily classified. It is a 
paradigmatic-functional grammar exhibiting 
the standard Hallidayan metafunctional diver- 
sification [Halliday, 1985; Matthiessen, 1990; 
Matthiessen and Bateman, 1991]. 



This means that it is organized, first, around 
choice (the paradigms of grammatical con- 
structions that stand in functional opposition) 
and, second, around a factorization of that 
choice according to its semantic motivation: 
is the choice to do with the propositional con- 
tent of the linguistic entity to be classified 
(ideational), is the choice to do with the tex- 
tual placing of the linguistic entity to be clas- 
sified (textual), or is the choice to do with 
the interpersonal relationship 
between speaker and hearer or with the atti- 
tude of the speaker towards the information 
expressed by the linguistic entity (interper- 
sonal). The motivations for the choices pro- 
vide hypotheses concerning the sorts neces- 
sary for controlling those choices. The Up- 
per Model has been derived by considering 
motivations for those choices exclusively as- 
signed to the ideational metafunction: there is 
no mixing of categories across metafunctional 
domains.²⁵ 

This builds into the design of an ontology 
motivated in such a way as the Upper Model 
the following features. 

First, we require an ontology that is sig- 
nificantly more abstract than syntactic real- 
ization classes. I have already suggested how 
this has been achieved with the Upper Model. 
The grammar, being organized in terms of a 
functional classification of possible constraints 
on constituency structure, is already more 
abstract than constituency structure per se. 
Further classification across the paradigms 
uncovered is then automatically more abstract 
and achieves a generalization across particu- 
lar lexico-grammatical contexts that supports 
a greater flexibility of expression of input 
expressions. The strict relationship to the 
grammatical stratum also makes sure that the 
kinds of mixed sorts that [Lang, 1991] finds 
and criticises in the LILOG hierarchy cannot 
occur: either an (interface, i.e., semantic) on- 
tological category has a specified consequence 
for lexico-grammatical expression or it is not 
accepted. 

Second, given the stratification suggested 
by the theory the Upper Model is automat- 
ically only the 'next level up' in the linguistic 
system: it is an ontology strongly connected 
to grammar below. It does not, by itself, pro- 
vide the necessary organization of higher level 
ontologies. Thus, in short, we see that an or- 
ganization closely reminiscent of a two-level 
semantics is automatically achieved, and that 
both levels require ontologies. 

Third, we have seen that it is a design 
goal that an ontology be as general as pos- 
sible — that it helps with classification across 
domains, tasks, and applications, but also 
be substantial enough to provide a rich scaf- 
folding for domain description. This raises 
the question: How can we guarantee that 
a proposed ontology is as general as we re- 
quire? We can now see that ontologies such 
as the Upper Model, which are based on 
motivations from grammar, are guaranteed 
to have the domain-independence required 



25 Although it is perfectly possible to imagine ap- 
plying the same 'grammar-as-filter' methodology on 
underlying motivational ontologies as carried out for 
the ideational metafunction (cf. [Bateman, 1991] for 
examples of this applied to the textual metafunction 
and [Matthiessen and Bateman, 1991] for general dis- 
cussion), the resulting organizations of information 
have very different properties. 

of them. Since ontological categories are 
motivated by the grammatical distinctions 
(and not by more arbitrary lexical collec- 
tions found in a given domain), those cate- 
gories are forced to be at least as general as 
the grammatical categories. It is, therefore, 



very unlikely that [Klose and von Luck, 1991, 
p462]'s claim that the LILOG ontology has a 
'more domain-independent status' than the 
early version of the Upper Model described 
in [Mann et al., 1985] would apply to current 
versions of the Upper Model. 

But we can go further and move beyond 
the kind of generalizability that refers sim- 
ply to domain-independence (which is gen- 
eralizability 'upwards' in the linguistic sys- 
tem), and beyond generalizability across the 
lexicogrammar (which is generalizability 
'downwards' in the linguistic system). When 
we also consider the metafunctional organi- 
zation of the linguistic system posited by 
systemic-functional theory, then we can see 
that generalizations both across 'text in- 
stances' and across 'speech functions' are also 
guaranteed — i.e., generalizations 'horizon- 
tally' across the same stratum of the linguistic 
system. This is depicted graphically in Fig- 
ure |^. These constraints rule out certain other 
potential sorts from the ontology, e.g., sorts 
concerned with the particular appearance of 
an entity at a given position in a text or with 
the speaker's attitude towards an event. Cer- 
tain of the sorts found in [Meteer, 1989]'s in- 
terface ontology are good examples of the for- 
mer kind. Having such sorts requires reclas- 
sification of domain information whenever a 
domain object is used in a text, since the tex- 
tual statuses of domain objects changes over 
the development of a text — i.e., from new to 
given, from theme to rheme, etc. This change 
of course needs to be represented: the point 
is that representing such information in the 
interface ontology again mixes very different 
kinds of information although this time on 
a 'horizontal' dimension across the linguistic 
system rather than a 'vertical' one. 

The kind of grammar that we employed as 
the initial motivation for guiding the develop- 
ment of the Upper Model has, therefore, gone 
a long way towards ensuring that the prop- 
erties desired of ontologies obtain. But an 
area of flexibility in the description then arises 
from the depth of grammatical description, 
i.e., the contents, rather than the type.
Particularly within the systemic-functional
approach, lexical descriptions are seen as more
specific versions of grammatical descriptions 
— there is no difference in kind. Thus, if we 
push lexicogrammatical description further in 
the direction of lexis, we automatically push 
further the depth of motivating semantic on- 
tology constructs that are needed. This bi- 
furcation in potential description needs more 
theoretical work before we can make any firm 
statements about whether it is more helpful 
to pursue one at the expense of the other, or 
whether they should be pursued in parallel as 
has been the case with the more general area 
of the grammar. 

We can now also consider some possible im- 
provements and explanations for some awk- 
ward phenomena/intuitions that have previ- 
ously hindered ontological engineering. For 
example, if there is a stratification of the kind 
argued for, why is it that suggestions for con- 
ceptual structure that have been put forward 
in a number of approaches appear also to be 
candidates for representation as sorts in the 
interface ontology? When the categories of
the Upper Model, for example, are examined,
many classes similar to those proposed in
'conceptual' ontology work are to be found.

To give a concrete example of this,
[Lang, 1991, p474], after careful discussion
concerning the problems of a mixed ontology,
defines some basic assumptions concerning
the structure of the conceptual ontology
drawn from earlier work, including
[Bierwisch and Lang, 1989]. With respect
to these assumptions, he outlines the following
set of conceptual domains which are to form
basic subsorts of the conceptual ontology:

D1: objects; D2: substances; D3: locations;
D4: time intervals; D5: events; D6: attitudes

We can also note here similarities with some
of the classes above from [Jackendoff, 1983],
[Jackendoff, 1990], [Langacker, 1987],
[Talmy, 1987], etc. But these are also sorts
already found, for example, in the Penman
Upper Model, where they have been entered
purely on the grounds that they are necessary
to directly constrain possible grammatical
realizations. Is it the case that the claim we saw
above by [Gust, 1991, p133] that 'there are
continuous variations between semantic forms
and conceptual structures' is, after all, true?
Can we introduce strict stratification and still 
account for the intuition that these concepts
indeed function at different strata? 

Lang already suggests that there may be 
certain genuinely 'linguistic' features that 
function definitionally for features at the 'con- 
ceptual' stratum: 

'... the representation of nouns like
Ofen [oven], Fahrzeug [vehicle], Boot
[boat] in the lexicon contains a spe- 
cific component PURPOSE (hence, an el- 
ement of our linguistic knowledge) by 
means of which the sort Nutzgegenstand 
[article for practical use] in the knowl- 
edge base is being accessed. This is but 
one example of how linguistic aspects 
of lexical representation can be made 
use of in defining ontological sorts in the
knowledge base.' [Lang, 1991, p470] 



Other 'genuine linguistic features' that Lang 
suggests for the basis of the ontological dis- 
tinctions include: 'bounded object' vs. 'non- 
bounded object', 'concrete object' vs. 'ab- 
stract object' — both very similar to other 
theoretical accounts. We can now go further 
and explain the relation between the linguistic 
(semantic) ontology types and the conceptual 
ontology types as follows. 



All of the reasoning that we have applied 
to the development of the Upper Model ontol- 
ogy with respect to its motivation in the lexi- 
cogrammar can be applied precisely to the re- 
lation between the Upper Model ontology and 
some higher stratum ontology. This follows as 
a consequence from the theoretical statement 
of the nature of realization within the strat- 
ified linguistic account. This means that we 
will need to find motivations for the seman- 
tic interface ontology sorts. It also means, 
however, that we can make use of the realiza- 
tion relation starting from the standpoint of 
the higher stratum and interpret the status of 
the semantic interface ontology as generalizing 
across different conceptual stratum situations; 
cf. Figure || Thus, for both the lexicogram- 
mar with respect to the semantic interface on- 
tology, and for the semantic interface ontology 
with respect to the conceptual ontology, it is 
likely that the more general intra-stratal
organization of the lower stratum will be
echoed in the overall intra-stratal organization of the
higher stratum. This gives us the observed 
link between constructs that are motivateable 
as general semantic concepts and constructs 
that appear to organize the conceptual hierarchy.
There is, then, no 'mixing' of the categories
of different strata; there is just a resonance
or echo of categories at one stratum
taken up at another.

Given both this theoretical and practical
binding of the contents of the different
strata, it is clear why there is
then a certain tension between strata, as
[Klose and von Luck, 1991, p462] note
from their experience with the lilog ontology:

'The tension between linguistic and in- 
ferential demands on the modeling is 
alive and forces compromises on both 
sides.' 

I have suggested that the kind of view of
realization between strata found in
systemic-functional linguistics, where there is both a
practical and a theoretical 'pulling' in both
directions, upwards to context and downwards
to (experiential) lexicogrammar,
offers an appropriate way of operating within
this tension between strata. The resulting 
methodology then uses the tension to help 
constrain organization decisions for the con- 
struction of interface ontologies that are use- 
ful for NLP and to remove the need for genuine 
'compromises' where an inappropriate cate- 
gory is postulated at one level because that 
level is insufficiently functionally differenti- 
ated from others. 

It is clear, however, that we know a great 
deal more about possibilities for ontologies of 
type Oi than we do about ontologies of type 
Oc. Moreover, given the results of the last
section, perhaps we know even less than we 
thought — clearly conceptual categories are 
now sometimes best reappropriated to a more 
abstract semantic type. This is a less than 
ideal situation — particularly given the view 
of stratification shown in Figure || and the es- 
tablished dialectic between strata. Because 
the realization relationship between strata is 
bi-directional, we should be able to use a
higher stratum to constrain our accounts at a
lower stratum. But the fact that we know very
little about the higher stratum in this case
removes one source of possible constraint.

Finally, here, however, I will draw atten- 
tion to one interesting consequence for the 
status of the higher-stratum ontology when 
we take into account the bi-directionality of 
the inter-stratal relationship. Since there is 
no difference assumed in the theoretical status
of the levels related by the interstratal
relationship, one might ask how it is that the
interface ontology is termed 'linguistic' and 
'semantic', whereas the higher-stratum ontol- 
ogy is 'non-linguistic' and 'conceptual'. I be- 
lieve that a far more appropriate view of the 
relationship is as depicted graphically in Fig- 
ure ^. All strata that stand in an interstratal 
relationship of the kind explored and used in 
this paper should be seen as semiotic levels 
of greater and lesser degrees of abstraction. 
The conceptual ontology thus becomes more 
of a contextual ontology, with context being 
interpreted in the sense of a level of social 
situation, closely in line with, for example,
[Halliday, 1978]. There is, then, the additional
question of how this entire complex
of inter-related levels of semiotic descriptions
relates to the supporting conceptual system
of human psychology. This is probably a very
different kind of relation than realization,
although it will probably again turn out to be
a dialectic relationship rather than a one-way
determination. This puts us in the position
to criticise some of the conceptual sorts pro- 
posed by Lang on exactly the same grounds 
that he has criticised mixed ontologies. For 
example, alongside the above-mentioned domains,
all of which may be more plausibly
grounded in the conceptual/perceptual system,
[Lang, 1991, p474] places: 'social institutions
(law, administration, marriage, education)'
and communicative behaviour (etiquette,
conversation, group dynamics). Such
a mixed set of sorts is unlikely to form a very 
stable or usable ontology: it is probably cru- 
cial to begin to refine further our levels of 
ontology, and their interactions, so that the 
mistakes made at the least abstract levels of 
ontological engineering are not just repeated 
again, at the next level 'up'. More detailed 
statements must, however, be left to future 
research! 



6 Summary, conclusion 
and final words 

The discussion of this paper has considered
the notion of 'ontology'. Starting from the 
view that an ontology is an organization of 
the world (which has been approached
by 'naive physics', 'conceptual dependencies',
'commonsense (meta)physics', and others), I
drew attention to the fact that such accounts
do not bring strong methodological and sub- 
stantive constraints to bear on ontology con- 
struction. Also unclear is the relationship of 
such ontologies to language. The gap is of- 
ten so large that this level is too abstract 
to have any direct relationship to required 
forms of expression. Contrariwise, this gap 
also leads to a weakening of the discriminative 
power of the constraints that can be brought 
to bear by linguistic patternings. Concretely, 
then, one cannot, for example, generate natural
language directly from such levels of description
without resolving, or 'fixing', an immense
number of degrees of freedom that remain
unaddressed (often quite rightly, if this
is seen as a conceptual ontology) in the on- 
tology itself. Much of the work that an NLP 
system requires to be done is, therefore, sim- 
ply not taken into consideration by the ab- 
stract ontology. Such ontologies are also, be- 
cause of their abstractness, difficult to popu- 
late reliably — if sizeable and potentially dis- 
tributed resource construction is undertaken, 
as it increasingly is, then this virtually guar- 
antees poor intercoder consistency. In short, 
such ontologies are of very limited value for
nlp work.

These problems have been noted by some
of those who have sought principles for ontology
design (cf. [Skuce and Monarch, 1990])
and those who need real shareable resources
(as for example in machine translation;
cf. [Steiner and Reuther, 1989]). The only
solution that has been found to these problems is
to place more reliance on language as a source 
of constraint. For this reason, then, views on 
language and the organization of the linguis- 
tic system become crucial for ontology design 
that is appropriate for nlp. Moreover, only by 
taking views on the linguistic system that are 
maximally supportive of the functionalities re- 
quired of ontologies can we avoid problems of 
lack of abstractness (i.e., being dominated by 
linguistic form) and problems of too much ab- 
stractness (i.e., being dominated by semantic 
theories of particular areas that lack connec- 
tion to linguistic form). In short, ontological 
engineering faces the following dilemma: in- 
terface ontologies 

• need to be abstract, large-scale, re-usable 
information classification devices, 

• but they cannot be too abstract, 
• or too near syntax,

• and need to be constrained by language.

The theoretical assumptions and resulting or- 
ganizational decisions that I have pursued in 
this paper appear to offer a very practical way 
of proceeding within this state of affairs. I have
also shown that several other beneficial prop- 
erties for NLP systems are derivable from the 
abstract organization of the linguistic system 
that systemic-functional theory posits. 

The paper has presented for broader debate 
a round of discussion that began in the context
of the developing ontology of the Penman
text generation system. This work, beginning
with the pre-computational descriptive
account called the Bloomington Lattice by
Halliday and Matthiessen, has passed through
several instantiations in computational form.
Future work will again have to consider
bringing together the linguistic descriptive
account (reworked to a new level of detail
in [Halliday and Matthiessen, to appear])
and the computational model. It is to be
hoped that this approach will build on the former
success of the Upper Model, simultaneously
moving us in some of the directions that
I raised as responses to problems with the Up- 
per Model. Thus, I have not suggested that 
the Upper Model we find in Penman is the 
'general solution' to ontological engineering — 
there are many more criticisms to be made of 
this ontology, again mostly concerning the ex- 
tent to which it succeeds as an instantiation of 
the theoretical principles that underlie it. The 
function of the ontology is also more finely
circumscribed than many others, but again
strictly according to the underlying theory.
We are not yet at a stage where an ontology 
can be accepted, even pragmatically for the 
needs of current nlp systems, as 'complete': 
what is more at issue is the development of ap- 
propriate methodologies for constructing on- 
tologies, and here again constraints offered by 
the linguistic system are of paramount impor- 
tance. The linguistic system, when viewed 
appropriately, gives a rich multidimensional 
set of constraints on adequate and appropriate 
designs for computational systems. The principal
dimensions applied in this paper were
those of strata and metafunctions. This by no
means exhausts the possible input of the theory,
however. For further dimensions of the
theory, see [Matthiessen and Bateman, 1991];
for additional examples of using these dimensions
to constrain computational system design,
see [Bateman et al., 1992]. I hope that
the paper has suggested some of the benefits
of employing such linguistic motivations, and
that further attempts to apply wider sets of
motivations will help us in the future.